Oldwiki.scinet.utoronto.ca:System Alerts

System Status

GPC: up, TCS: up, Sandy: up, File System: up
Gravity: up, P7: up, Viz: down, BGQ: up, HPSS: up


Mar 3, 1:00 PM: Limited operations

HPSS system out of space

Our apologies for this inconvenience. Our HPSS is about to run out of space (tapes and slots), and ingestion operations are suspended until further notice. We have a planned expansion on order, slated to be operational in another month or so; however, the rate of ingestion in the last 2 months was exceptionally high and caught us somewhat off guard. In the meantime you may still recall or delete material.


Thu Feb 18 20:36:25 EST 2016: BGQ scratch is back online. Some jobs were killed.


Thu Feb 18, 8:13 PM: The scratch file system on the BGQ is having problems. We're investigating.


Jan 15, 11:20 AM: Systems are in the process of being brought online.

Jan 14, 3:30 PM: Downtime extended to noon on Friday Jan 15th (estimate).

Our sincere apologies for this extension of the downtime. Unfortunately, a problem has come to light with some of the disks in the file system. Because of the way the file system is set up, no data has been lost, but if we put the system back into production now, a single additional failure would risk data loss or corruption, so this needs to be fixed now.

The BGQ file system is not affected by this problem and may be brought up earlier.

Updates will be posted here.

Note: Because of the downtime, we'll be deferring the scratch purging that was scheduled for January 15th to Wednesday January 20th.

Jan 13, 7:00 AM: Downtime in effect.

SCHEDULED MAINTENANCE DOWNTIME ANNOUNCEMENT

There will be a full SciNet shutdown from January 13th to January 14th, 2016 for scheduled annual maintenance.

All systems will go down at 7 AM on Wednesday January 13th; all login sessions and jobs will be killed at that time.