Oldwiki.scinet.utoronto.ca:System Alerts


System Status

GPC: up | TCS: up | Sandy: up | Gravity: up | BGQ: down | File System: full and showing issues, investigating
P7: up | P8: down | KNL: down | Viz: up | HPSS: up


Mon Nov 7 8:00:00 EST 2016 Apparent file system issues. Scratch file system filled up overnight. We are investigating how to mitigate this, and may have to stop jobs.

Fri Oct 28 23:00:00 EDT 2016 The login nodes and devel nodes of the GPC, P7 and BGQ, as well as the datamover nodes, will be rebooted between 2 am and 6 am on Sat Oct 29. Running and queued jobs will not be affected, but interactive sessions will be closed.

Mon Sep 26 10:33:47 EDT 2016 HPSS schedule is back to normal operations.

Sun Sep 25 12:37:12 EDT 2016 Problems resolved. Systems have started coming online. Check the status "lights" above.

Sun Sep 25 10:16:37 EDT 2016 A power outage tripped the main breaker and other circuits. Power has been restored to the site, but there may be an issue with power to the cooling system that needs to be resolved before any compute systems can be restarted.

Sun Sep 25 09:28:15 EDT 2016 Staff are en route to the site. An ETA for recovery will be given once the situation has been assessed.

Sun Sep 25 08:46 EDT 2016 Power outage at datacentre.