Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"

From oldwiki.scinet.utoronto.ca
Line 1: Line 1:
== System Status: <span style="color:#33AA33">'''UP'''</span> ==  
+
== System Status: <span style="color:#33AA33">'''UP'''</span> and <span style="color:#FF0000">'''DOWN'''</span> ==
 +
 
 +
We continue to experience random outages of the system.  Network problems are the latest suspect.  All/most GPC jobs died at around 2:40pm today.
  
The systems are up, but there are still glitches stemming from the transition to CentOS 6.  Please report any problems.
 
  
 
--------------------
 
--------------------
Line 22: Line 23:
 
Let us know if you encounter unexpected behavior due to the transition.
 
Let us know if you encounter unexpected behavior due to the transition.
  
Last updated: Thu Dec  8 10:37:31 EST 2011
+
Last updated: Thu Dec  8 15:06:31 EST 2011
  
  

Revision as of 16:06, 8 December 2011

System Status: UP and DOWN

We continue to experience random outages of the system. Network problems are the latest suspect. All/most GPC jobs died at around 2:40pm today.




We are still encountering problems resulting from the transition to CentOS 6. Although we had tested this operating system on a subset of nodes, problems arise when running at large scale, i.e. with almost 4,000 nodes in the GPC cluster.

Please bear with us as we try to fix things. Symptoms include slow (or disappearing) file systems, sluggish nodes, and general network problems. We'll inform users when we've resolved these issues. In the meantime, please check this space regularly for updates.

Some of the known issues (and workarounds) are listed here.

Thanks for your patience and understanding!

The SciNet Team.


The GPC was transitioned to CentOS 6 on Monday, December 5, 2011. This should not have affected running jobs, but the scratch and home file systems were unexpectedly unmounted on Monday afternoon, killing most jobs. Please resubmit any jobs that were killed.

Let us know if you encounter unexpected behavior due to the transition.

Last updated: Thu Dec 8 15:06:31 EST 2011



(Previous messages)