Oldwiki.scinet.utoronto.ca:System Alerts

System Status: UP

To mitigate some of the file system problems, there will be a relatively short downtime of all SciNet systems on Thursday Feb 9 to perform a reconfiguration. The downtime will start at 9am and is expected to last approximately two hours. Check here for updates.

Mon Feb 6, 13:06:00 EST 2012


File systems (scratch and home) were unmounted at around 3:30 am and again at around 23:15 on Jan 30. Jobs may have crashed.

File systems are back now. Please resubmit your jobs.

Tue Jan 31 9:12:00 EST 2012


System Temporary Change:

Due to some changes we are making to the GPC GigE nodes, if you run multinode ethernet MPI jobs, you will need to explicitly request the ethernet interface in your mpirun command (IB multinode jobs are not affected):

For OpenMPI -> mpirun --mca btl self,sm,tcp

For Intel MPI -> mpirun -env I_MPI_FABRICS shm:tcp

There is no need to do this if you run on IB, or if you run single-node MPI jobs on the ethernet (GigE) nodes. Please check GPC_MPI_Versions for more details.
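
As an illustration only, the two mpirun forms above could be wrapped in a small shell helper for use in a job script. This is a sketch, not an official SciNet recipe: the function name gige_mpirun_cmd, the flavor labels "openmpi" and "intelmpi", the process count, and the executable ./my_app are all hypothetical. Only the mpirun options themselves come from the notice above.

```shell
#!/bin/bash
# Hypothetical helper: print the mpirun prefix that selects the TCP
# (ethernet) transport for a multinode job on the GigE nodes.
# The flavor labels are names chosen for this sketch only.
gige_mpirun_cmd() {
    case "$1" in
        openmpi)  echo "mpirun --mca btl self,sm,tcp" ;;
        intelmpi) echo "mpirun -env I_MPI_FABRICS shm:tcp" ;;
        *)        echo "unknown MPI flavor: $1" >&2; return 1 ;;
    esac
}

# Show the full command line for a 16-process run of a placeholder
# executable; nothing is launched here, the command is only printed.
echo "$(gige_mpirun_cmd openmpi) -np 16 ./my_app"
```

In a real job script the printed command would be run rather than echoed, and only for multinode jobs on the GigE nodes. IB jobs and single-node jobs do not need the extra options.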

Thu Jan 19 11:12:55 EST 2012

(Previous messages)