Difference between revisions of "Scheduler"

From oldwiki.scinet.utoronto.ca
Line 1: Line 1:
The queueing system used at SciNet is based around the [http://www.clusterresources.com/products/moab-cluster-suite/workload-manager.php Moab Workload Manager].
+
The queueing system used at SciNet is based around Cluster Resources [http://www.clusterresources.com/products/moab-cluster-suite/workload-manager.php Moab Workload Manager].
Moab is used on both the GPC and TCS however Torque] is used as the backend resource manager on the GPC and IBM's [LoadLeveler] is used on the TCS.
+
Moab is used on both the GPC and TCS however [Torque] is used as the backend resource manager on the GPC and IBM's [LoadLeveler] is used on the TCS.
  
 +
This page outlines some of the most common Moab commands with full documentation available from Moab [http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml here].
  
[http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml Moab Commands]
+
=== Queue Info===
  
 +
To see all jobs queued on a system use
 +
<pre>
 +
showq
 +
</pre>
 +
 +
Three sections are shown; running, idle, and blocked.  Idle jobs are commonly referred to as queued jobs
 +
as they meet all the requirements, however they are waiting for available resources.  Blocked jobs
 +
are either caused by improper resource requests or more commonly by exceeding a user or groups allowable
 +
resources.  For example if you are allowed to submit 10 jobs and you submit 20, the first 10
 +
jobs will be submitted properly and either run right away or be queued, however the other 10 jobs
 +
will be blocked and the jobs won't be submitted to the queue until one of the first 10 finishes. 
 +
 +
=== Available Resources ===
  
mshow
+
To show how many total nodes are currently free use
showq
+
<pre>
 +
showbf -A
 +
</pre>
 +
To show how many infiniband nodes are free use
 +
<pre>
 +
showbf -f ib
 +
</pre>
 +
 
 +
=== Reservations ===
 +
 
 +
showres
 +
 
 +
=== Job Submission ===
 +
 
 +
==== Interactive ====
 +
 
 +
On the GPC an interactive queue session can be requested using the following
 +
<pre>
 +
qsub -l nodes=2:ppn=8,walltime=1:00:00 -I
 +
</pre>
 +
 
 +
==== Non-interactive (Batch) ====
  
showbf
+
For a non-interactive job submission you require a submission script formatted for the appropriate resource manger. Examples
 +
are provided for the [[GPC_Quickstart#Submitting_A_Batch_Job | GPC]] and [[TCS_Quickstart#Submitting_A_Job | TCS]].
  
qsub
+
=== Job Status ===
  
 
checkjob
 
checkjob
 +
 +
=== Cancel a Job ===
  
 
canceljob
 
canceljob
 +
 +
=== User Stats ===
 +
 +
showstats

Revision as of 14:21, 27 August 2009

The queueing system used at SciNet is based on Cluster Resources' Moab Workload Manager. Moab is used on both the GPC and TCS; however, Torque is the backend resource manager on the GPC, while IBM's LoadLeveler is used on the TCS.

This page outlines some of the most common Moab commands. Full documentation is available in the Moab command overview at http://www.clusterresources.com/products/mwm/docs/a.gcommandoverview.shtml.

Queue Info

To see all jobs queued on a system, use:

showq

Three sections are shown: running, idle, and blocked. Idle jobs are commonly referred to as queued jobs: they meet all the requirements but are waiting for available resources. Blocked jobs are caused either by improper resource requests or, more commonly, by exceeding a user's or group's allowable resources. For example, if you are allowed to submit 10 jobs and you submit 20, the first 10 jobs will be submitted properly and either run right away or be queued. The other 10 jobs will be blocked and won't be submitted to the queue until one of the first 10 finishes.
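The 10-of-20 example can be sketched as simple shell arithmetic. This is only an illustration of the counting: the variable names are made up for this sketch, and the real limits are set by the scheduler configuration, not by anything a user runs.

```shell
# Illustrative numbers taken from the example in the text.
limit=10        # assumed per-user job limit
submitted=20    # jobs the user tries to submit

# Jobs within the limit are accepted (they run or sit idle in the queue).
accepted=$(( submitted < limit ? submitted : limit ))

# Everything beyond the limit is blocked until an accepted job finishes.
blocked=$(( submitted - accepted ))

echo "accepted=$accepted blocked=$blocked"
```

Each time one of the accepted jobs finishes, one blocked job can move into the queue.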

Available Resources

To show how many nodes in total are currently free, use:

showbf -A 

To show how many InfiniBand nodes are free, use:

showbf -f ib

Reservations

showres

Job Submission

Interactive

On the GPC, an interactive queue session can be requested with the following command:

qsub -l nodes=2:ppn=8,walltime=1:00:00 -I
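As a rough sketch of how the resource request is put together, the snippet below builds the same command from variables and works out the total number of cores requested, assuming ppn is the number of processors per node. The variable names are illustrative only; the snippet prints the command rather than running it.

```shell
# Values from the command in the text.
nodes=2
ppn=8                 # processors per node
walltime=1:00:00      # hours:minutes:seconds

# Total cores requested across all nodes.
total_cores=$(( nodes * ppn ))

# Assemble the interactive request; -I asks for an interactive session.
cmd="qsub -l nodes=${nodes}:ppn=${ppn},walltime=${walltime} -I"

echo "$cmd"
echo "total cores requested: $total_cores"
```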

Non-interactive (Batch)

For a non-interactive job submission, you need a submission script formatted for the appropriate resource manager. Examples are provided for the GPC and TCS.
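A minimal sketch of what a Torque submission script for the GPC might look like is shown below. It reuses the resource request from the interactive example. The job name and the program to run are hypothetical placeholders, and the GPC Quickstart examples remain the authoritative reference.

```shell
#!/bin/bash
# Torque reads the #PBS lines as directives; the shell treats them as comments.
#PBS -l nodes=2:ppn=8,walltime=1:00:00
#PBS -N example_job

# Torque sets PBS_O_WORKDIR to the directory the job was submitted from.
# Fall back to the current directory when the script is run outside Torque.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir"

echo "Job running in $workdir"

# Hypothetical program; replace with the real executable.
# ./my_program
```

Assuming the script is saved as job.sh (a hypothetical file name), it would be submitted with `qsub job.sh`.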

Job Status

checkjob

Cancel a Job

canceljob
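Both checkjob and canceljob take a job ID as their argument. The sketch below uses a made-up job ID and only calls checkjob if the Moab commands are actually installed, so it is safe to run anywhere. The canceljob line is left commented out.

```shell
# Hypothetical job ID; use the ID reported at submission or shown by showq.
jobid=12345

if command -v checkjob >/dev/null 2>&1; then
    # Show the detailed status of the job.
    checkjob "$jobid"
    # Uncomment to cancel the job.
    # canceljob "$jobid"
    moab_available=yes
else
    echo "Moab commands not found; would run: checkjob $jobid"
    moab_available=no
fi
```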

User Stats

showstats