Oldwiki.scinet.utoronto.ca:System Alerts


System Status

GPC: up | TCS: down | Sandy: up | ARC: up | File System: up
Gravity: down | P7: down | BGQ: up | HPSS: up


Wed Jan 14 17:02:18 EST: Emergency shutdown of all compute nodes at 8:30 AM tomorrow (Thurs, 15 Jan). After starting to bring up systems this afternoon, we learned that an emergency replacement of the cooling tower fan belt is required tomorrow morning. Compute systems that are currently up will need to be shut down at 8:30 AM tomorrow. We will attempt to keep login nodes and storage up during tomorrow's downtime, which is expected to last 1-4 hours.

Wed Jan 14 14:34:18 EST: Expect some systems (login nodes, GPC and BGQ) to be available by approximately 3:00-3:30 PM.

Wed Jan 14 13:09:03 EST: Free-cooling is being restored and should allow compute systems to come online this afternoon. Chiller maintenance will continue throughout the day and possibly into tomorrow. Check back for updates.


SCHEDULED MAINTENANCE DOWNTIME ANNOUNCEMENT

On January 14 and 15, scheduled maintenance on the data centre's cooling system will require all systems to be shut down for at least the first part of the maintenance. All SciNet systems will be shut down at 7 AM on Wednesday January 14, 2015 and all login sessions and jobs will be killed at that time.

At the earliest, the systems will be available again later on Wednesday afternoon, but it is possible that the downtime will extend into Thursday, January 15, 2015. Check here on the SciNet wiki (wiki.scinethpc.ca) for updates on Wednesday and Thursday.

--


(Previous messages)