Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca

Revision as of 08:20, 25 December 2013

System Status

GPC: up | TCS: up | Sandy: up | ARC: up
Gravity: up | P7: up | BGQ: down | HPSS: up

Wed Dec 25 07:15:10 EST 2013: Some TCS jobs were killed at ~7AM today as we shut down frames 9 and 10 to help stabilize temperatures in the machine room. Please check your jobs and resubmit. The nodes are being restarted.

Wed Dec 25 07:15:10 EST 2013: Cooling tower was successfully de-iced and water temperatures have returned to normal.

Wed Dec 25 06:57:10 EST 2013: Shutting down some TCS nodes to help lower room temperatures. Cooling tower has frozen over. Trying to get de-icing cycle going again.

Sun Dec 22 11:08:13 EST 2013: Another power event at 0312 today knocked out the BGQ again. Unfortunately, key staff are without power, so the time to restore is unknown (more than 250,000 customers in the GTA are currently without power).

Sun Dec 22 00:19:23 EST 2013: BGQ up and jobs running. Some may have been killed so check your logs.

Sat Dec 21 23:39:26 EST 2013: Power glitch to site at 2240 caused the BGQ to shut down - it is being restored. A large ice storm is underway and PowerStream reports over 20,000 customers without power. There may well be more issues overnight.


Last updated: Wed Dec 18 15:59:09 EST 2013

Dear SciNet users:

SciNet is officially on holiday from Sat Dec 21, 2013, until Sun Jan 5, 2014. All systems will be up, and maintained on a best-effort basis. User support will also be on a best-effort basis, though we will try to help if we can.

We wish you all Happy Holidays, and the best for the New Year.

The SciNet team.

