Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca
Revision as of 13:02, 2 March 2012

System Status: UP

Fri Mar 2 11:59:33 EST 2012


Roughly one third of the TCS nodes shut themselves down on a thermal check at about 11:40 today because of a glitch in the water supply temperature. Unfortunately, all jobs running on those nodes were lost. Please check your jobs and resubmit if necessary.


Thu Feb 9 11:50:57 EST 2012


Temporary system change for MPI ethernet jobs:

Due to some changes we are making to the GPC GigE nodes, if you run multinode ethernet MPI jobs (IB multinode jobs are fine), you will need to explicitly request the ethernet interface in your mpirun:

For OpenMPI -> mpirun --mca btl self,sm,tcp

For IntelMPI -> mpirun -env I_MPI_FABRICS shm:tcp

There is no need to do this if you run on IB, or if you run single-node MPI jobs on the ethernet (GigE) nodes. Please check GPC_MPI_Versions for more details.
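
As a rough illustration of the change described above, a job script could pick the extra mpirun arguments from a small helper. This is only a sketch: the function name eth_mpirun_args, the flavour labels "openmpi" and "intelmpi", the process count and the program name ./my_app are all made up for the example. Only the flags themselves come from this notice.

```shell
#!/bin/bash
# Hypothetical helper (not part of any SciNet tooling): print the extra
# mpirun arguments that select the ethernet (TCP) interface, as listed in
# the notice above, for a given MPI flavour.
eth_mpirun_args() {
    case "$1" in
        openmpi)  echo "--mca btl self,sm,tcp" ;;
        intelmpi) echo "-env I_MPI_FABRICS shm:tcp" ;;
        *)        echo "unknown MPI flavour: $1" >&2; return 1 ;;
    esac
}

# Example use in a multinode GigE job script (placeholders: -np 16, ./my_app):
#   mpirun $(eth_mpirun_args openmpi) -np 16 ./my_app
```

The command substitution is left unquoted in the usage line on purpose, so that the shell splits the flags into separate arguments for mpirun.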


(Previous messages)