Difference between revisions of "Oldwiki.scinet.utoronto.ca:System Alerts"
m (→System Status) |
Line 7: | Line 7: | ||
up.png for 100% up | up.png for 100% up | ||
--> | --> | ||
− | [[File: | + | [[File:down.png|scratch file system down|link=GPC Quickstart]]GPC |
− | [[File: | + | [[File:down.png|scratch file system down|link=TCS Quickstart]]TCS |
− | [[File: | + | [[File:down.png|scratch file system down|link=GPU Devel Nodes]]ARC |
− | [[File: | + | [[File:down.png|scratch file system down|link=P7 Linux Cluster]]P7 |
[[File:up.png|up|link=BGQ]]BGQ | [[File:up.png|up|link=BGQ]]BGQ | ||
− | [[File: | + | [[File:down.png|scratch file system down|link=HPSS]]HPSS |
Fri Aug 9 15:32 - /scratch and /project are down. Login and home directories are ok, but no jobs can run, and most of those running will likely die if/when they need to do I/O. | Fri Aug 9 15:32 - /scratch and /project are down. Login and home directories are ok, but no jobs can run, and most of those running will likely die if/when they need to do I/O. |
Revision as of 16:00, 9 August 2013
System Status
Fri Aug 9 15:32 - /scratch and /project are down. Login and home directories are ok, but no jobs can run, and most of those running will likely die if/when they need to do I/O.
Fri Aug 9 15:25 - File system problems. Scratch is unmounted. Jobs are likely dying. We are working on it.
Thu Aug 8 13:22 - Most systems are back up.
Thu Aug 8 11:18:45 - Problems with storage hardware. Trying to resolve with vendor.
Thu Aug 8 08:14:01 - Cooling has been restored. Starting to recover systems.
Large voltage drop at site knocked out cooling system at 0558 today. Staff en route to site.
Last update: Thu 8 Aug 2013 06:12:28 EDT