Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca
  
 
Oct 22 14:00:00 Logins to the SciNet systems were suddenly disconnected. We are investigating the issue.
 
 
Oct 19 19:00:00 All systems should be up. Let us know if you are still experiencing difficulties.
 
 
Oct 19 16:20:00 The GPC and TCS have been brought back up. ARC, BGQ, and HPSS are not in operation yet.
 
 
Oct 19 13:05:00 Half of the GPC is being brought up again. TCS, P7, ARC, BGQ, and HPSS are not in operation yet as the chiller control system still needs repairing.
 
 
Oct 19 11:02:48 Staff and technicians on-site have concluded that a chiller control board needs to be replaced. We believe we can bring up the chiller manually now and get a portion of the GPC running by 1 PM. The repair work will require a brief chiller shutdown (but no GPC shutdown) later in the day, so TCS will stay off for now in order to minimize heat load.
 
 
Oct  18 23:19:04 Still seeing significant voltage fluctuations in facility power. Will keep systems off rather than risk another failure overnight. Sorry for the inconvenience. Expect to be back up by noon tomorrow (possibly earlier).
 
 
Oct  18 22:35:13 Power quality issues brought down the chiller, which required a shutdown of the clusters.  Power and chiller are coming back up, and we hope to have the clusters up by morning.
 
 
Oct  18 21:01:00 The datacentre is down due to a power failure.  We are investigating the problem.
 
  
 
([[Previous_messages:|Previous messages]])
 

Revision as of 14:05, 22 October 2012

System Status: ISSUES

Oct 22 14:00:00 Logins to the SciNet systems were suddenly disconnected. We are investigating the issue.

(Previous messages)