POD Cluster Maintenance
Carter and Scholar are back online for use as of 6:25am, though they will be operating with many nodes still offline. Staff will be working through Wednesday to steadily increase the number of nodes available. This concludes the POD cluster maintenance.
Carter and Scholar are still being worked on. We will issue another update by 6:00am if they are not already in service.
The Rice, Hammer, and Peregrine1 clusters have been returned to normal operations as of 1:40am. Work continues on Carter and Scholar, and we will issue an update on those systems by 3:00am if they are not already in service.
The Snyder cluster has been returned to normal operations as of 12:00am. Work continues on the other clusters listed here.
The work continues on these clusters, although progress was substantially delayed by the concurrent storage systems failure (Unscheduled Storage Outage). We will post an update by 2:00am or sooner as clusters return to service.
The Carter, Hammer, Peregrine1, Rice, Scholar, and Snyder clusters will be unavailable beginning Tuesday, June 7th, 2016 at 5:30am EDT for scheduled maintenance. The clusters will return to full production by Tuesday, June 7th, 2016 at 10:00pm.
During this time, maintenance will be performed on the cooling systems used by these clusters. This maintenance period will also allow critical high-availability fixes to be made to the Research Data Depot while client clusters are offline.
Any PBS job whose requested walltime would extend past Tuesday, June 7th, 2016 at 5:30am EDT will not start and will remain in the queue until the maintenance is completed.
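The walltime rule above can be checked by hand before submitting a job. The following is a minimal sketch in Python, not the scheduler's actual logic: the function names are hypothetical, the walltime is assumed to be given as HH:MM:SS, and EDT is assumed to be a fixed UTC-4 offset.

```python
from datetime import datetime, timedelta, timezone

# Assumption: EDT modeled as a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")

# Start of the maintenance window, taken from the announcement.
MAINTENANCE_START = datetime(2016, 6, 7, 5, 30, tzinfo=EDT)


def parse_walltime(walltime):
    """Convert an 'HH:MM:SS' walltime string to a timedelta (assumed format)."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def can_start_before_maintenance(now, walltime, maintenance_start=MAINTENANCE_START):
    """Return True if a job started at `now` would end by the maintenance start."""
    return now + parse_walltime(walltime) <= maintenance_start


def max_walltime(now, maintenance_start=MAINTENANCE_START):
    """Longest walltime that still fits before maintenance (zero if none)."""
    return max(maintenance_start - now, timedelta(0))
```

For example, under these assumptions a job starting at 1:30am EDT on June 7th could request at most four hours of walltime and still be eligible to start; a job requesting more would wait in the queue until after the maintenance.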