Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca
 
(1,000 intermediate revisions by 12 users not shown)
Line 1: Line 1:
 
== System Status==
 
== System Status==
[[File:up.png|up|link=GPC Quickstart]]GPC
+
<!--
[[File:up.png|up|link=TCS Quickstart]]TCS
+
  Notes for updating the system status:
[[File:up.png|up|link=GPU Devel Nodes]]ARC
 
[[File:up.png|up|link=P7 Linux Cluster]]P7
 
[[File:up.png|up|link=BGQ]]BGQ
 
[[File:up.png|up|link=HPSS]]HPSS
 
  
Tue Jul 30, 19:24:00: <span style="color:red">Downtime announcement</span>
+
  -  When removing system status entries, please archive them to:
  
All systems will be shut down at 8AM on Thurs, 1 Aug for emergency repair
+
    http://wiki.scinethpc.ca/wiki/index.php/Previous_messages:
of a component in the cooling system. Systems are expected to be back
 
on-line in the afternoon. Check here for progress updates.
 
  
Apologies for the short notice but we only learned of the problem this
+
    (yes, the trailing colon is part of the url)
afternoon. We're now attempting to reschedule other maintenance planned
 
for later in August to this Thursday as well (hence the uncertainty in
 
the length of the required downtime).
 
  
Mon Jul 29 10:40:00  All systems back up.
+
  -  The 'status circles' can be one of the following files:  
  
Mon Jul 29 10:09:00 TCS is back up. BGQ still down.
+
    down.png  for down
 +
    up25.png  for 25% up
 +
    up50.png  for 50% up
 +
    up75.png  for 75% up
 +
    up.png    for 100% up
  
Mon Jul 29 8:37:00  Power glitch overnight took systems down. GPC is already up, and other systems are being brought up.
+
   
 +
{|
 +
|[[File:up.png|up|link=https://docs.scinet.utoronto.ca/index.php/Main_Page]][https://docs.scinet.utoronto.ca Niagara]
 +
|-
 +
|[[File:up.png|up|link=BGQ]][[BGQ]]
 +
|[[File:up.png|up|link=P7 Linux Cluster]][[P7 Linux Cluster|P7]]
 +
|[[File:up.png|up|link=P8]][[P8]]
 +
|-
 +
|[[File:up.png|up|link=SOSCIP_GPU]][[SOSCIP_GPU|SGC]]
 +
|[[File:up.png|up|link=Knights Landing]][[Knights Landing|KNL]]
 +
|[[File:down.png|down|link=HPSS]][https://docs.scinet.utoronto.ca/index.php/HPSS HPSS]
 +
|-
 +
|[[File:up.png|up|]]File System
 +
|[[File:up.png|up|]]External Network
 +
|
 +
|}
  
Wed Jul 24 15:00:00  All BGQ racks back in production
+
-->
  
Thu Jul 18 10:00:00  Bgqdev and one of the two bgq racks are up again
+
System status can now be found at [https://docs.scinet.utoronto.ca docs.scinet.utoronto.ca]
  
Wed Jul 17 17:00:00  Bgqdev and bgq systems are down.
 
  
Wed Jul 17 15:58:00  We're re-enabling the rack; please resubmit crashed jobs.
+
<b> Mon 23 Apr 2018 </b> GPC-compute is decommissioned, GPC-storage available until <font color=red><b>30 May 2018</b></font>
  
Wed Jul 17 15:24:12 One of the two racks of the BlueGene/Q production system has gone down.
+
<b> Thu 18 Apr 2018 </b> The Niagara system will undergo an upgrade to its InfiniBand network between 9am and 12pm. The upgrade should be transparent to users; however, there is a chance of network interruption.
  
Mon Jul 15 09:45:49: Gravity01 (head node in gravity cluster) is down until further notice. Jobs may still be submitted from devel nodes or arc01
+
<b> Fri 13 Apr 2018 </b> HPSS system will be down for a few hours on <b>Mon, Apr/16, 9AM</b>, for hardware upgrades, in preparation for the eventual move to the Niagara side.
  
([[Previous_messages:|Previous messages]])
+
<b> Tue 10 Apr 2018 </b> Niagara is open to users.
 +
 
 +
<b> Wed 4 Apr 2018 </b> We are very close to the production launch of Niagara, the new system installed at SciNet.
 +
While the RAC allocation year officially starts today, April 4/18, the Niagara system is still undergoing some final tuning and software updates, so the plan is to officially open it to users next week.
 +
 
 +
All active GPC users will have their accounts, $HOME, and $PROJECT, transferred to the new
 +
Niagara system.  Those of you who are new to SciNet, but got RAC allocations on Niagara,
 +
will have your accounts created and ready for you to log in.
 +
 
 +
We are planning an extended [https://support.scinet.utoronto.ca/education/go.php/370/index.php Intro to SciNet/Niagara session], available in person at our office, and webcast on Vidyo and possibly other means, on Wednesday April 11 at noon EST.
 +
 
 +
<!-- [https://support.scinet.utoronto.ca/wiki/index.php/Previous_messages:] -->

Latest revision as of 14:23, 7 May 2018

System Status

System status can now be found at docs.scinet.utoronto.ca


Mon 23 Apr 2018 GPC-compute is decommissioned, GPC-storage available until 30 May 2018

Thu 18 Apr 2018 The Niagara system will undergo an upgrade to its InfiniBand network between 9am and 12pm. The upgrade should be transparent to users; however, there is a chance of network interruption.

Fri 13 Apr 2018 HPSS system will be down for a few hours on Mon, Apr/16, 9AM, for hardware upgrades, in preparation for the eventual move to the Niagara side.

Tue 10 Apr 2018 Niagara is open to users.

Wed 4 Apr 2018 We are very close to the production launch of Niagara, the new system installed at SciNet. While the RAC allocation year officially starts today, April 4/18, the Niagara system is still undergoing some final tuning and software updates, so the plan is to officially open it to users next week.

All active GPC users will have their accounts, $HOME, and $PROJECT, transferred to the new Niagara system. Those of you who are new to SciNet, but got RAC allocations on Niagara, will have your accounts created and ready for you to log in.

We are planning an extended Intro to SciNet/Niagara session, available in person at our office, and webcast on Vidyo and possibly other means, on Wednesday April 11 at noon EST.