Oldwiki.scinet.utoronto.ca:System Alerts

Revision as of 19:05, 28 September 2014 by Rzon (talk | contribs) (→‎System Status)

System Status

GPC: up, TCS: up, Sandy: up, ARC: up, File System: up
Gravity: up, P7: up, BGQ: up, HPSS: down

Sun Sep 28 20:00:00 EDT 2014: ARC up and accepting jobs. File systems should be fixed too.

Sun Sep 28 18:30:00 EDT 2014: Sandy and Gravity are up. ARC is up, but the scheduler is not yet operational from arc01 (use gpc nodes to submit). Some filesystem issues with gss (/scratch2, for groups using it) on some gpc nodes are also being investigated.

Sun Sep 28 13:59:00 EDT 2014: BGQ up.

Sun Sep 28 13:37:00 EDT 2014: GPC, P7 and TCS are back up. Users will be able to log in shortly. All jobs were killed in the event, so please resubmit. We're working on getting the BGQ, Sandy and Gravity systems up too.

Sun Sep 28 09:43:28 EDT 2014: A brief power outage knocked out the cooling system at about 08:06 this morning. Cooling has been restored. Disk controllers and filesystems are being brought up. Systems will be unavailable until at least noon.

Fri Sep 19 15:50:54 EDT 2014: The scheduler has been stable for the past hour, and jobs are being scheduled. Please submit your jobs. Please be aware that showq is not reporting some jobs that were running before the glitch. Use qstat instead of showq for these jobs. Most of the jobs queued this morning were rejected by the scheduler when it came back online.


Fri Sep 19 12:52:20 EDT 2014: We've been experiencing intermittent problems with the scheduler. Job submission has been paused temporarily, until we can restart the scheduler. Please check this space for updates.


Mon Sep 15 11:13:10 EDT 2014: The scheduler had some issues this morning and had to be restarted to resolve them. Some queued and running jobs have been lost.


(Previous messages)