Deprecated scripts
The scripts below are older scripts that were previously on the wiki and should, in principle, still work. They are listed here as a reference for users who still use them or have used them in the past. Better, less error-prone, and more flexible alternatives are available, however.
Serial jobs of similar duration for more than 8 cores at once
If you have hundreds of serial jobs that you want to run concurrently and the nodes are available, then the approach above (i.e., the first script on the User Serial page), while useful, would require tens of scripts to be submitted separately. Instead, you can request more than one gigE node and use the following routine to distribute your processes among the cores.
A better way of doing this can be found on the User Serial page.
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#
#PBS -l nodes=100:ppn=8,walltime=1:00:00
#PBS -N serialdynamicMulti

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# FUNCTIONS
function init_dist2nodes {
  D2N_AVAIL=($(cat $PBS_NODEFILE|uniq))
  D2N_NUM=$(cat $PBS_NODEFILE|uniq|wc -l)
  D2N_COUNTER=0
}

function dist2nodes {
  D2N_SELECTED=$(echo ${D2N_COUNTER}/8|bc)
  if((${D2N_SELECTED}>${D2N_NUM}-1)); then
    let "D2N_SELECTED=${D2N_NUM}-1"
  fi
  D2N_NODE=${D2N_AVAIL[${D2N_SELECTED}]}
  let "D2N_COUNTER=D2N_COUNTER+1"
}

# INITIALIZATION
init_dist2nodes
mydir=$(pwd)

# MAIN CODE
for((i=1;i<=800;i++)); do

  # call dist2nodes to store the name of the next node to run on in the variable D2N_NODE
  dist2nodes

  # here is where you put the command that you will run many times.
  # It could be another script that takes an argument or simply an executable
  ssh $D2N_NODE "cd ${mydir}; ./my_command.sh $i" &

done
wait
</source>
Notes:
- You need to have the extras module loaded for the bc command used in this script to work.
- You can run more or fewer than 8 processes per node by modifying the number 8 in the dist2nodes function.
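- Make sure the number of nodes you request is consistent with the number of processes per node and the total number of processes you launch (in the script above, 100 nodes with 8 processes per node give the 800 processes started by the loop).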
- Refer also to notes in the above section.
--cneale 12 May 2010 (UTC)
- !!WARNING!!
- With the above script, it is extremely important that all your runs take almost the same amount of time, because every node stays allocated until the slowest run has finished!
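The mapping from process counter to node that dist2nodes performs can also be written with the shell's built-in integer arithmetic, which would avoid the dependency on bc. The following is only a minimal sketch and not part of the original script; the function name node_index and its argument order are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): map a zero-based process
# counter to a node index, placing per_node consecutive processes on each node
# and clamping to the last node if more processes are launched than fit.
node_index() {
    local counter=$1 per_node=$2 num_nodes=$3
    local idx=$(( counter / per_node ))   # integer division, no bc needed
    if (( idx > num_nodes - 1 )); then
        idx=$(( num_nodes - 1 ))
    fi
    echo "$idx"
}

# Example: 3 nodes with 8 processes per node
node_index 0 8 3    # first process: node index 0
node_index 8 8 3    # ninth process: node index 1
node_index 30 8 3   # beyond capacity: clamped to node index 2
```

Changing the second argument corresponds to changing the number 8 in dist2nodes; the clamping shows why launching more processes than nodes times processes per node overloads the last node.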
Serial jobs of varying duration on 1 node
If you have many (50 or more) relatively short serial runs whose walltimes vary, and you know that eight of them fit in a node's memory at once, then the following strategy in your submission script maximizes CPU utilization:
<source lang="bash">
#!/bin/bash
# MOAB/Torque submission script for multiple, dynamically-run
# serial jobs on SciNet GPC
#
#PBS -l nodes=1:ppn=8,walltime=1:00:00
#PBS -N serialdynamic

# DIRECTORY TO RUN - $PBS_O_WORKDIR is directory job was submitted from
cd $PBS_O_WORKDIR

# COMMANDS ARE ASSUMED TO BE SCRIPTS CALLED job*, WHICH CALL THE MAIN EXECUTABLE:
psname='myrun'

# EXECUTE COMMANDS
for serialjob in job*
do
    sleep 5
    # ps prints one header line, so the count is (running processes + 1):
    # keep waiting while 8 or more copies of $psname are running
    njobs=`ps -C $psname|wc -l`
    while [ $njobs -gt 8 ]
    do
        sleep 5
        njobs=`ps -C $psname|wc -l`
    done
    # start the next job script in the background
    ./$serialjob &
done
wait
</source>
Notes:
- This is one of the simplest cases of dynamically run serial jobs.
- You can run more or fewer than 8 processes per node by modifying the number 8 in the while loop.
- A better way of doing this can be found on the User Serial page.
- Running many serial jobs often entails many disk reads and writes, which can be detrimental to performance. In that case, running off the ramdisk may be an option.
- When using a ramdisk, make sure you copy your results from the ramdisk back to the scratch after the runs, or when the job is killed because time has run out.
- More details on how to set up your script to use the ramdisk can be found on the Ramdisk wiki page.
- This script optimizes resource utilization, but can only use 1 node (8 cores) at a time. To use more nodes, it could in principle be combined with the script in the previous section. There is no need to try that yourself, however: the GNU Parallel utility offers that functionality (see the User Serial page).
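The copy-back step mentioned in the ramdisk notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the Ramdisk wiki page: the variable names RAMDISK_DIR and RESULTS_DIR and the function save_results are hypothetical, temporary directories stand in for the real ramdisk and scratch locations, and whether the scheduler delivers a TERM signal early enough before the job is killed should be checked on your system.

```bash
#!/bin/bash
# Hypothetical locations: in a real job, RAMDISK_DIR would point to a directory
# on the ramdisk and RESULTS_DIR to one on scratch. Temporary directories are
# used as placeholders so that this sketch runs anywhere.
ramdisk=${RAMDISK_DIR:-$(mktemp -d)}
results=${RESULTS_DIR:-$(mktemp -d)}

# Copy everything produced on the ramdisk back to the results directory.
save_results() {
    cp -r "$ramdisk"/. "$results"/
}

# If the job receives TERM (assumed here to happen when time runs out),
# save what exists so far and then exit.
trap 'save_results; exit 1' TERM

cd "$ramdisk" || exit 1
for i in 1 2 3; do
    # placeholder for a real serial run that writes its output locally
    ( echo "result $i" > "out_$i.txt" ) &
done
wait            # a trapped signal interrupts wait, so the trap runs promptly

save_results    # normal completion: copy the results back
```

Running the work in the background and blocking in wait matters here: bash defers a trap until the current foreground command finishes, but returns from wait immediately when a trapped signal arrives.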