Previous messages:

From oldwiki.scinet.utoronto.ca
Revision as of 13:32, 26 November 2010 by Jchong (talk | contribs) (added note about down with issue.)
Jump to navigation Jump to search

Note that these are old messages, the most recent system status can be found on the main page. Fri Nov 26 6:15 EST 2010: The data center had some heating issue at around 1:30am Fri Nov 26 and system are down right now. We are investigating the cause of the problem and trying to fix the issue so that we could bring the system back up and running.

Fri Nov 5 15:26 EDT 2010: Scratch is down; we are working on it.

Tue Oct 26 10:32:22 EDT 2010: scratch is hung, we're investigating.

Fri Sep 24 23:02:41 EDT 2010: Systems are back up. Please report any problems to <support at scinet dot utoronto dot ca>

Fri Sep 24 19:55:02 EDT 2010: Chiller restarted. Systems should be back online this evening - likely 9-10PM. Widespread power glitch seems to have confused one of the Variable Speed Drives and control system was unable to restart it automatically.

Fri Sep 24 16:44:39 EDT 2010: The systems were unexpectedly and automatically shut down, due to a failure of the chiller. We are investigating and will bring the systems back up as soon as possible.

Mon Sep 13 15:39:05 EDT 2010: The systems were unexpectedly shut down, automatically, due to a failure of the chiller. We are investigating and will bring the systems back up as soon as possible.

Fri Sep 10 15:08:53 EDT 2010: GPFS upgraded to 3.3.0.6; new scratch quotas applied; check yours with /scinet/gpc/bin/diskUsage.

Mon Aug 16 13:12:08 EDT 2010: Filesystems are accessible and login is normal.

Mon Aug 16 12:38:45 EDT 2010: Login access to SciNet is failing due to /home filesystem problems. We are actively working on a solution.

Mon Aug 9 16:10:03 EDT 2010: The file system is slow and scheduler cannot be reached. We are working on the problem.

Sat Aug 7 11:42:50 EDT 2010: System status: Normal.

Sat Aug 7 10:45:52 EDT 2010: Problems with scheduler not responding on GPC. Should be fixed within an hour

Wed Aug 4 14:03:59 EDT 2010: GPC scheduler currently having filesystem issue which means you may have difficulty submitting jobs and monitoring the queue. We are working on the issue now.

Fri Jul 23 17:50:00 EDT 2010: Systems are up again after testing and maintenance.

Fri Jul 23 16:14:02 EDT 2010: Systems are down for testing and maintenance. Expect to be back up about 9PM this evening.

Tue Jul 20 17:40:45 EDT 2010: Systems are back. You may resubmit jobs.

Tue Jul 20 15:39:37 EDT 2010: Most GPC jobs died as well. The ones that appear to be running are likely in an unknown state, so will be killed in order to ensure they do not produce bogus results. The machines should be up shortly. Thanks for your patience and understanding. The SciNet team.

Tue Jul 20 15:19:30 EDT 2010: File systems down. We are looking at the problem now. All jobs on TCS have died. We'll inform you when the machine is available for use again. Please log out from the TCS.

Tue Jul 20 14:26:10 EDT 2010: Scratch file system down. We are looking at the problem now.

Fri Jul 16 10:59:27 EDT 2010: Systems normal.

Sun Jul 11 13:08:02 EDT 2010: All jobs running at ~3AM this morning almost certainly failed and/or were killed about 11AM. A hardware failure has reduced /home and /scratch performance by a factor of 2 but should be corrected tomorrow.

Sun Jul 11 09:56:27 EDT 2010: /scratch was inaccessible as of about 3AM

Fri Jul 9 16:18:18 EDT 2010: /scratch is accessible again

Fri Jul 9 15:38:43 EDT 2010: New trouble with the filesystems. We are working to fix things

Fri Jul 9 11:28:00 EDT 2010: The /scratch filesystem died at about 3 AM on Fri Jul 9, and all jobs running at the time died.