User Serial
Running serial jobs on SciNet
You should not submit purely serial jobs to the queue (on either the GPC or the TCS), since doing so would waste the computational power of 7 (or 31, on the TCS) cpus while the job is running. While we encourage you to try to parallelize your code, sometimes it is beneficial to run several serial codes at the same time. Note that because the TCS is a machine specialized for parallel computing, you should only use the GPC for concurrent serial runs.
Serial jobs of similar duration
So it should be said first that SciNet is a parallel computing resource, and our priority will always be parallel jobs. Having said that, if you can make efficient use of the resources using serial jobs and get good science done, that's good too, and we're happy to help you.
The GPC nodes each have 8 processing cores, and making efficient use of these nodes means using all eight cores. As a result, we'd like users to take up whole nodes (e.g., run multiples of 8 jobs) at a time. The most straightforward way to do this is to bunch the jobs in groups of 8 or more that will take roughly the same amount of time, and create a job script that looks something like this:
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple serial jobs on
# SciNet GPC
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialx8

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# EXECUTION COMMAND; ampersand off 8 jobs and wait
(cd jobdir1; ./dojob1) &
(cd jobdir2; ./dojob2) &
(cd jobdir3; ./dojob3) &
(cd jobdir4; ./dojob4) &
(cd jobdir5; ./dojob5) &
(cd jobdir6; ./dojob6) &
(cd jobdir7; ./dojob7) &
(cd jobdir8; ./dojob8) &
wait
</source>
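Assuming you saved the above as a file called serialx8.sh (the name is arbitrary), you submit it like any other GPC job:

<source lang="bash">
qsub serialx8.sh
</source>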
There are four important things to take note of here. First, the wait command at the end is crucial; without it the job will terminate immediately, killing the 8 programs you just started.
Second, it is important to group the programs by how long they will take. If (say) dojob8 takes 2 hours and the rest only take 1, then for one hour 7 of the 8 cores on the GPC node are wasted; they sit idle but are unavailable to other users, and the utilization of the node over the whole run is only 56%. This is the sort of thing we will notice, and users who don't make efficient use of the machine will have their ability to use SciNet resources reduced. If you have many serial jobs of varying length, use the submission script to balance the computational load, as explained in the section "Serial jobs of varying duration" below.
Third, it is necessary to have a good idea of how much memory the jobs will require. The GPC compute nodes have about 14GB in total available to user jobs running on the 8 cores (a bit less, say 13GB, on the devel nodes gpc01..04). So the jobs also have to be bunched in ways that fit into 14GB. If that's not possible -- each individual job requires significantly in excess of ~1.75GB -- then it is possible in principle to run fewer jobs so that they do fit; but then, again, there is an under-utilization problem. In that case, the jobs are likely candidates for parallelization, and you can contact us at <support@scinet.utoronto.ca> and arrange a meeting with one of the technical analysts to help you do just that.
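If you are not sure how much memory a job needs, one way to estimate it is to run a single instance under GNU time before bunching eight of them together; the example below assumes the job is launched as ./dojob1 and that /usr/bin/time is the GNU version, whose verbose output reports the peak ("maximum resident set size") memory use:

<source lang="bash">
# run one instance and report its resource usage, including peak memory
/usr/bin/time -v ./dojob1
</source>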
Fourth, if the memory requirements allow it, you could run more than 8 jobs at the same time, up to 16, exploiting the HyperThreading feature of the Intel Nehalem cores.
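If you do try the HyperThreading route, writing out sixteen subshells by hand gets unwieldy; a loop does the same job. This is only a sketch, assuming run directories jobdir1 through jobdir16, each containing a script called dojob (placeholder names):

<source lang="bash">
# EXECUTION COMMAND; launch 16 jobs, one per logical (HyperThreaded) core, and wait
for i in $(seq 1 16); do
  (cd jobdir$i; ./dojob) &
done
wait
</source>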
Version for more than 8 cores at once (still serial)
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above, while useful, would require tens of scripts to be submitted separately. It is possible for you to request more than one gigE node and to use the following routine to distribute your processes amongst the cores.
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#PBS -l nodes=100:ppn=8,walltime=1:00:00
#PBS -N serialdynamicMulti

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# FUNCTIONS
function init_dist2nodes {
  # list of unique nodes assigned to this job, how many there are, and a counter
  D2N_AVAIL=($(cat $PBS_NODEFILE|uniq))
  D2N_NUM=$(cat $PBS_NODEFILE|uniq|wc -l)
  D2N_COUNTER=0
}

function dist2nodes {
  # pick the node for the next process: 8 consecutive processes go to each node
  D2N_SELECTED=$(echo ${D2N_COUNTER}/8|bc)
  if((${D2N_SELECTED}>${D2N_NUM}-1)); then
    let "D2N_SELECTED=${D2N_NUM}-1"
  fi
  D2N_NODE=${D2N_AVAIL[${D2N_SELECTED}]}
  let "D2N_COUNTER=D2N_COUNTER+1"
}

# INITIALIZATION
init_dist2nodes
mydir=$(pwd)

# MAIN CODE
for((i=1;i<=800;i++)); do
  # call dist2nodes to store the name of the next node to run on in the variable D2N_NODE
  dist2nodes
  # here is where you put the command that you will run many times. It could be
  # another script that takes an argument or simply an executable
  ssh $D2N_NODE "cd ${mydir}; ./my_command.sh $i" &
done
wait
</source>

Notes:
- You can run more or fewer than 8 processes per node by modifying the number 8 in the dist2nodes function.
- Be sure to keep the number of nodes requested in the #PBS -l line consistent with the number of processes per node and the total number of processes you will initiate (see the sketch after these notes).
- Refer also to notes in the above section.
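For instance, following the notes above, running the same 800 processes with 16 per node (using HyperThreading) would involve changes roughly like the ones below; this is a sketch under those assumptions, not a tested recipe:

<source lang="bash">
#PBS -l nodes=50:ppn=8,walltime=1:00:00   # 800 processes / 16 per node = 50 nodes

# in dist2nodes, divide the counter by 16 instead of 8, so that 16
# consecutive processes are assigned to each node
D2N_SELECTED=$(echo ${D2N_COUNTER}/16|bc)
</source>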
--cneale 12 May 2010 (UTC)
Serial jobs of varying duration
If you have a lot (50+) of relatively short serial jobs to do, and you know that eight of them fit in memory at the same time, the following strategy in your submission script maximizes the cpu utilization:
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialdynamic

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*, WHICH CALL THE MAIN EXECUTABLE:
psname='myrun'

# EXECUTE COMMANDS
for serialjob in job*
do
  sleep 5
  # count the running executables (ps output includes a header line,
  # so 8 running jobs give a count of 9)
  njobs=`ps -C $psname|wc -l`
  while [ $njobs -gt 8 ]
  do
    sleep 5
    njobs=`ps -C $psname|wc -l`
  done
  $serialjob &
done
wait
</source>

Notes:
- This is the simplest case of dynamically run serial jobs.
- You can run more or fewer than 8 processes per node by modifying the number 8 in the while loop.
- Doing many serial jobs often entails doing many disk reads and writes, which can be detrimental to performance. In that case, running off the ramdisk may be an option.
- When using a ramdisk, make sure you copy your results from the ramdisk back to scratch after the runs, or when the job is killed because time has run out (a sketch of how to automate this with a shell trap follows these notes).
- More details on how to setup your script to use the ramdisk can be found on the Ramdisk wiki page.
- This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. To use more nodes, it could in principle be combined with the script in the previous section.
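One way to make sure results on the ramdisk survive the end of the job is to copy them back both at normal completion and when the scheduler terminates the job at the walltime limit, by trapping the termination signal. The sketch below assumes the results live under /dev/shm (the ramdisk) and that $SCRATCH points to your scratch space; the directory names are placeholders, and the Ramdisk wiki page remains the authoritative reference:

<source lang="bash">
# copy results from the ramdisk back to scratch; run at normal exit
# and when the scheduler sends a termination signal near the walltime limit
function save_results {
  cp -r /dev/shm/myresults $SCRATCH/myresults   # placeholder paths
}
trap save_results TERM

# ... launch the serial jobs from the ramdisk here, then ...
wait
save_results
</source>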
--Rzon 02:22, 2 April 2010 (UTC)