
From oldwiki.scinet.utoronto.ca
'''Running serial jobs on SciNet'''
===General considerations===
 
 
 
 
 
__TOC__
 
 
 
It should be said first that SciNet is a parallel computing resource, and our priority will always be parallel jobs. Having said that, if you can make efficient use of the resources using serial jobs and get good science done, that's good too, and we're happy to help you.
Nonetheless, you should never submit purely serial jobs to the queue (on either GPC or TCS), since this would waste the computational power of 7 (or 31, on the TCS) cpus while the job is running.  While we encourage you to try and parallelize your code, sometimes it is beneficial to run several serial codes at the same time. Note that because the TCS is a machine specialized for parallel computing, you should only use the GPC for concurrent serial runs.
  
 
The GPC nodes each have 8 processing cores, and making efficient use of these nodes means using all eight cores.  As a result, we'd like to have users take up whole nodes by running 8 jobs or more at once.
When running multiple jobs on the same node, it is imperative to have a good idea of how much memory the jobs will require.  The GPC compute nodes have about 14GB in total available to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes <tt>gpc01..04</tt>).  So the jobs also have to be bunched in ways that will fit into 14GB.  If that's not possible -- that is, if each individual job requires significantly more than ~1.75GB -- then it is possible in principle to just run fewer jobs so that they do fit; but then there is again an under-utilization problem.  In that case, the jobs are likely candidates for parallelization, and you can contact us at [mailto:support@scinet.utoronto.ca <support@scinet.utoronto.ca>] to arrange a meeting with one of the technical analysts to help you do just that.

If the memory requirements allow it, you could actually run more than 8 jobs at the same time, up to 16, exploiting the [[GPC_Quickstart#HyperThreading | HyperThreading]] feature of the Intel Nehalem cores.  It may seem counterintuitive, but running 16 jobs on 8 cores has increased some users' overall throughput by 10 to 30 percent.
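The memory arithmetic above is easy to check with a couple of lines of shell. This is only a back-of-the-envelope sketch using the figures quoted in the text (~14GB usable per node, 8 or 16 jobs); exact usable memory varies per node.

```bash
#!/bin/bash
# Per-job memory budget on a GPC compute node, using the figures
# quoted above: roughly 14GB available to user jobs on 8 cores.
node_mb=14336           # ~14GB expressed in MB (14 * 1024)
jobs=8
per_job_mb=$((node_mb / jobs))
echo "8 jobs:  ${per_job_mb} MB each"      # 1792 MB, i.e. ~1.75GB
echo "16 jobs: $((node_mb / 16)) MB each"  # 896 MB with HyperThreading
```

If your individual jobs need more than this budget, bunch fewer of them per node, or consider parallelizing as described above.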
===Serial jobs of similar duration===

The most straightforward way to run multiple serial jobs is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job that looks a bit like this:

<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple serial jobs on
# SciNet GPC
#
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialx8

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# EXECUTION COMMAND; ampersand off 8 jobs and wait
(cd jobdir1; ./dojob1) &
(cd jobdir2; ./dojob2) &
(cd jobdir3; ./dojob3) &
(cd jobdir4; ./dojob4) &
(cd jobdir5; ./dojob5) &
(cd jobdir6; ./dojob6) &
(cd jobdir7; ./dojob7) &
(cd jobdir8; ./dojob8) &
wait
</source>

There are three important things to take note of here. First, the <tt>wait</tt> command at the end is crucial; without it the job will terminate immediately, killing the 8 programs you just started.

Second, it is important to group the programs by how long they will take. If (say) dojob8 takes 2 hours and the rest only take 1, then for one hour 7 of the 8 cores on the GPC node are wasted; they are sitting idle but unavailable for other users, and the utilization of this node over the whole run is only 56%. This is the sort of thing we will notice, and users who don't make efficient use of the machine will have their ability to use SciNet resources reduced. If you have many serial jobs of varying length, use the submission script to balance the computational load, as explained [[#Serial jobs of varying duration | below]].

Third, we reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs.
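The role of <tt>wait</tt> can be seen in a toy version of the script above. The three <tt>sleep</tt> commands below are stand-ins for the <tt>dojob</tt> scripts, which are not part of this sketch:

```bash
#!/bin/bash
# Toy version of the "ampersand off jobs, then wait" pattern:
# three dummy jobs run concurrently; wait blocks until all finish.
start=$SECONDS
(sleep 1; echo "job 1 done") &
(sleep 2; echo "job 2 done") &
(sleep 1; echo "job 3 done") &
wait    # without this line the script would exit immediately
elapsed=$((SECONDS - start))
echo "all jobs finished after ~${elapsed}s"
```

The elapsed time is roughly that of the slowest job (~2s), not the sum of all three (4s), because the jobs run in parallel; this is also why grouping jobs of similar duration matters.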
===GNU Parallel===

You could also use GNU parallel in your script for the case above. GNU parallel is a really nice tool to run multiple serial jobs in parallel. It offers essentially the same functionality as the above on-node scripts, but with a syntax which is almost that of <tt>xargs</tt>.

GNU parallel is accessible on the GPC in the module <tt>gnu-parallel</tt>, which you can load in your [[Important_.bashrc_guidelines|.bashrc]]:
<source lang="bash">
module load gnu-parallel
</source>

It is easiest to demonstrate the usage of GNU parallel by example. The script above can be replaced by
 
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple serial jobs on SciNet GPC
#
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialx8

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# EXECUTION COMMAND
parallel -j 8 <<EOF
  cd jobdir1; ./dojob1
  cd jobdir2; ./dojob2
  cd jobdir3; ./dojob3
  cd jobdir4; ./dojob4
  cd jobdir5; ./dojob5
  cd jobdir6; ./dojob6
  cd jobdir7; ./dojob7
  cd jobdir8; ./dojob8
EOF
</source>
  
The <tt>-j8</tt> parameter sets the number of jobs to run at the same time.

For this particular case, using GNU Parallel or not is a matter of taste. The GNU-Parallel version is a bit more flexible though, since one could give, say, 32 commands to the <tt>parallel</tt> command, which would be executed in bunches of eight automatically. And for the application below, the alternatives to GNU Parallel basically mean writing your own scheduler-within-a-scheduler, which is tricky and error prone.
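The xargs resemblance can be tried out on any machine. As a rough local sketch (not on SciNet), the following uses <tt>xargs -P</tt>, which provides a similar bounded fan-out and is available even where GNU parallel is not installed; the "tasks" here are just invented echo commands:

```bash
#!/bin/bash
# Bounded fan-out with xargs -P, analogous to 'seq 8 | parallel -j 4 ...':
# at most 4 of the 8 dummy tasks run at any one time.
done_count=$(seq 8 | xargs -P 4 -I{} sh -c 'echo "finished task {}"' | wc -l)
echo "$done_count tasks completed"
```

Note that GNU parallel additionally keeps each task's output unscrambled and can dispatch to remote nodes, which plain <tt>xargs</tt> cannot.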
  
===Serial jobs of varying duration===

If you have a lot (50+) of relatively short serial runs to do, '''of which the walltime varies''', and if you know that eight jobs fit in memory without memory issues, then the following strategy in your submission script maximizes the cpu utilization:

<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialdynamic

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR
  
# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED jobs*
find -name 'jobs*' | parallel -j 8
</source>

Notes:
* You can run more or fewer than 8 processes per node by modifying <tt>parallel</tt>'s <tt>-j8</tt> argument.
* Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to performance. In that case, running off the ramdisk may be an option.
* When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch space after the runs, or when the job is killed because time has run out.
* More details on how to set up your script to use the ramdisk can be found on the [[User_Ramdisk|Ramdisk wiki page]].
* This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. The next section addresses how to use more nodes.
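The find-pipe pattern above can be exercised locally with throwaway job scripts. In this sketch the <tt>jobs1..jobs4</tt> scripts are invented for the demonstration, and <tt>xargs -P</tt> stands in for <tt>parallel -j 8</tt> in case GNU parallel is not installed:

```bash
#!/bin/bash
# Create four throwaway 'jobs*' scripts, then run them with bounded
# concurrency, mimicking: find -name 'jobs*' | parallel -j 8
workdir=$(mktemp -d)
cd "$workdir"
for i in 1 2 3 4; do
  printf '#!/bin/sh\necho "result of job %s"\n' "$i" > "jobs$i"
  chmod +x "jobs$i"
done
find . -name 'jobs*' | xargs -P 8 -n 1 sh > all.out
completed=$(wc -l < all.out)
echo "$completed jobs completed"
cd - >/dev/null && rm -rf "$workdir"
```

Because the runner picks up the next job script as soon as a core frees up, jobs of different durations keep all cores busy, unlike the fixed bunches of the similar-duration approach.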
  
===Version for more than 8 cores at once (still serial)===

If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. Instead, you can request more than one gigE node and use the following routine to distribute your processes amongst the cores.

<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#
#PBS -l nodes=25:ppn=8,walltime=1:00:00
#PBS -N serialdynamicMulti

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# GNU PARALLEL NEEDS A COMMA SEPARATED LIST: CONSTRUCT IT FROM $PBS_NODEFILE:
NODES=$(uniq $PBS_NODEFILE|tr \\n ,|sed s/.$//)
# Note: this one-liner extracts the unique nodes on separate lines,
#       then replaces the end-of-lines by commas,
#       and finally removes the last character (a superfluous comma).

# START PARALLEL JOBS
seq 800 | parallel -j8 -S$NODES -W$PWD ./myrun {}
# Note:
#   seq 800    : generates numbers 1 through 800 as input to parallel
#   -j8        : makes 8 commands run simultaneously on each node
#   -S$NODES   : specifies the nodes to use
#   -W$PWD     : start remote commands in current local directory
#   ./myrun {} : is the command to run, with {} replaced by the input
</source>
Notes:
* Submitting several bunches to single nodes, as in the section above, is a more failsafe way of proceeding, since a node failure would then only affect one of those bunches, rather than all of the runs.
* GNU Parallel needs a comma-separated list of nodes given to its <tt>-S</tt> argument. This is constructed from the file $PBS_NODEFILE (which contains all nodes assigned to the job, with each node duplicated 8x, once for each core on that node).
* GNU Parallel reads lines of input and converts them to arguments in the execution command. The execution command is the last argument given to <tt>parallel</tt>, with <tt>{}</tt> replaced by the lines of input.
* <span style="color:red;">The -W argument is essential</span>: it sets the working directory on the other nodes, which would default to your home directory if omitted. Since /home is read-only on the compute nodes, you would not get any output at all!
* We reiterate that if memory requirements allow it, you should try to run more than 8 jobs at once, with a maximum of 16 jobs. You can run more or fewer than 8 processes per node by modifying the -j8 parameter to the parallel command.
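The $PBS_NODEFILE one-liner is easy to verify outside the queue by mocking up a nodefile; the node names below are invented for the demonstration:

```bash
#!/bin/bash
# Mock $PBS_NODEFILE: Torque lists each assigned node once per core,
# so a 3-node, 8-core job yields 24 lines. The one-liner from the
# script above should reduce this to a comma-separated node list.
PBS_NODEFILE=$(mktemp)
for node in gpc-f101n001 gpc-f101n002 gpc-f101n003; do
  for core in 1 2 3 4 5 6 7 8; do echo "$node"; done
done > "$PBS_NODEFILE"

NODES=$(uniq $PBS_NODEFILE|tr \\n ,|sed s/.$//)
echo "$NODES"   # gpc-f101n001,gpc-f101n002,gpc-f101n003
rm "$PBS_NODEFILE"
```

Note that <tt>uniq</tt> only collapses ''consecutive'' duplicate lines, which is sufficient here because the nodefile groups each node's entries together.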
===More on GNU parallel===

The documentation for GNU parallel can be found [http://www.gnu.org/software/parallel/ here] and its man page [http://www.gnu.org/software/parallel/man.html here]. The man page is also available on the GPC when the gnu-parallel module is loaded, with the command

<code>$ man parallel</code>

The man page describes further options, such as how to make sure the output is not all scrambled, and gives examples.

===Older scripts===

Older scripts, which mimicked some of the GNU parallel functionality, can be found on the [[Deprecated scripts]] page.

--[[User:Rzon|Rzon]] 02:22, 2 April 2010 (UTC)

Revision as of 12:00, 16 December 2010
