Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

Revision as of 17:24, 22 February 2017

System Status

GPC	TCS	Sandy	Gravity	BGQ	File System
P7	P8	KNL	Viz	HPSS

Wed 22 Feb 2017 16:17:00 EST Globus access to HPSS is currently not operational. We hope to have a resolution for this soon.

Wed Feb 22 12:02:09 EST 2017 The HPSS library is back in service.

Tue Feb 21 19:01:08 EST 2017 The robot arm in the HPSS library is stuck, and a support call with the vendor has been opened. All jobs have been suspended until that is fixed, hopefully tomorrow (Wednesday).

Older Messages

Mon Feb 6 16:01:47 EST 2017 A full shutdown of the systems was required in order to bring the interconnect and filesystems back up in an orderly way. Please watch this space for updates.

Mon Feb 6 14:24:51 EST 2017 We're experiencing internal networking problems, which have caused filesystems to become inaccessible on some nodes. Work is underway to fix this.

@@ Line 33: / Line 33: @@
 |}
+<b>Wed 22 Feb 2017 16:17:00 EST</b> Globus access to HPSS is currently not operational.  We hope to have a resolution for this soon.
 <b>Wed Feb 22 12:02:09 EST 2017</b> The HPSS library is back in service.
 <b>Tue Feb 21 19:01:08 EST 2017</b> The robot arm in the HPSS library is stuck, and a support call with the vendor has been opened. All jobs have been suspended until that is fixed, hopefully tomorrow (Wednesday).
+<b>Older Messages</b>
 <b>Mon Feb  6 16:01:47 EST 2017</b>  A full shutdown of the systems was required in order to bring the interconnect and filesystems back up in an orderly way.  Please watch this space for updates.
@@ Line 42: / Line 45: @@
 <b>Mon Feb  6 14:24:51 EST 2017</b>  We're experiencing internal networking problems, which have caused filesystems to become inaccessible on some nodes.  Work is underway to fix this.
-<b>Sat Jan 28 12:54:12 EST 2017</b> GPC and TCS are up, as well as the filesystems.  We have reverted to the old /scratch and /project disks on the GPC, until we can ascertain what was wrong with the new appliance.  In the meantime please submit your jobs as usual.  Also, please help us by cleaning up unnecessary stuff on /scratch.  For TCS users:  the changes we implemented, where you need to use /scratchtcs, are still in effect.
-<b>Sat Jan 28 12:19:16 EST 2017</b>  We are bringing the systems back.  Expect to be ready in a couple of hours.
-<b>Sat 28 Jan 2017 8:41 EST</b> BGQ is not affected and the system is up.
-<b>Sat 28 Jan 2017 8:15 EST</b> Further issues found on the file system; system access to users has been closed until we can solve these issues.
-<b>Fri 27 Jan 2017 15:11:58 EST</b> Cluster network issue is resolved and filesystem access is finally resolved after determining the root cause of the network issue.
-<b> Fri Jan 27 11:20:32 EST 2017 </b> While we're restoring things, file systems will generally not be available for usage during this period, to facilitate our work. Sorry.
-<b> Fri Jan 27 10:02:32 EST 2017 </b> The IB network fabric had a failure earlier today that affected the file systems. The IB fabric is back to normal, and we're working on restoring the file systems at the moment.
-<b> Fri Jan 27 7:34:00 EST 2017 </b> Issues with the new scratch file system; we're investigating.
-<b> Thu Jan 26 21:24:14 EST 2017 </b> Maintenance finished, systems back online and available, with the exception of the TCS, that does not accept jobs yet (but the devel nodes are accessible).
-Jan 25, 2017, 18:48: BGQ is online.
 <!-- [https://support.scinet.utoronto.ca/wiki/index.php/Previous_messages:]  -->
