Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"


Revision as of 15:27, 20 January 2015

System Status

GPC: up, TCS: up, Sandy: up, ARC: up, File System: up
Gravity: up, P7: up, BGQ: down, HPSS: up

Tue Jan 20 14:27:06 EST 2015: At noon on Tuesday, January 20th, 2015, both 2-rack BlueGene/Q systems, bgq and bgqdev, will be taken down to be merged into one 4-rack system (i.e. 65536 cores). We expect the BGQ to be up again some time on Thursday, January 22nd, 2015.

Sat 17 Jan 2015 21:50:40 EST: Cooling has been restored. Systems are being restarted and will likely be available within an hour or so. Root cause was a frozen pipe in the cooling tower (very strange; this has never happened before, and today is relatively warm compared to the past two weeks).

Sat 17 Jan 2015 19:34:00 EST: JCI on site as well. Diagnosing issue.

Sat 17 Jan 2015 17:33:47 EST: Unusual cooling problem. Systems down. Staff en route to site.


--


(Previous messages)