Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca

Revision as of 19:25, 30 July 2013

System Status

GPC: up, TCS: up, ARC: up, P7: up, BGQ: up, HPSS: up

Tue Jul 30, 19:24:00: Downtime announcement

All systems will be shut down at 8 AM on Thursday, 1 August, for emergency repair of a component in the cooling system. Systems are expected to be back online in the afternoon. Check here for progress updates.

Apologies for the short notice, but we only learned of the problem this afternoon. We are now attempting to reschedule other maintenance, planned for later in August, to this Thursday as well (hence the uncertainty in the length of the required downtime).

Mon Jul 29 10:40:00 All systems back up.

Mon Jul 29 10:09:00 TCS is back up. BGQ still down.

Mon Jul 29 08:37:00 A power glitch overnight took the systems down. GPC is already up, and the other systems are being brought up.

Wed Jul 24 15:00:00 All BGQ racks back in production

Thu Jul 18 10:00:00 Bgqdev and one of the two bgq racks are up again

Wed Jul 17 17:00:00 Bgqdev and bgq systems are down.

Wed Jul 17 15:58:00 We're re-enabling the rack; please resubmit crashed jobs.

Wed Jul 17 15:24:12 One of the two racks of the BlueGene/Q production system has gone down.

Mon Jul 15 09:45:49: Gravity01 (head node in gravity cluster) is down until further notice. Jobs may still be submitted from devel nodes or arc01.

(Previous messages)