Outages and Maintenance
-
The Research Data Depot has been restored to service. Original message: A portion of the systems serving the Research Data Depot has suffered a failure. Some systems using Depot have been affected, particularly research clusters and users accessing the Depot over NFS...
-
Conte and Hathi Cluster Maintenance
The Conte and Hathi clusters have been updated and returned to full production. Original message: This is a gentle reminder that the Conte and Hathi clusters will undergo scheduled maintenance beginning on Tuesday, February 21st, 2017 at 8:00am EST. Please sa...
-
Unscheduled scratch outage on Rice, Snyder, and Hammer
The scratch filesystem serving Hammer, Rice, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Hammer, Rice, and Snyder has been...
-
Halstead MPI problem, scheduling paused
Following the security updates on Halstead, an issue was discovered that prevented multi-node MPI jobs from running properly. Scheduling on Halstead has been stopped, and systems engineers are working on fixing the issue. We will provide further stat...
-
Emergency Security Patching of RCAC Clusters
Due to a recent security vulnerability, the Carter, Halstead, Hammer, Radon, Rice, Scholar, and Snyder clusters will have their operating system upgraded to a newer version between February 2, 2017 at 5:00pm and March 2, 2017 at 5:00pm EST. Unlike other cl...
-
Unscheduled scratch outage on Conte
The scratch filesystem serving Conte is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Conte has been paused while storage engineers addres...
-
Connectivity issues to Research Data Depot
System monitoring has revealed intermittent issues connecting to the Research Data Depot on Thursday, January 19. When this issue occurs, users will experience pauses when working in a UNIX shell on community cluster systems, or as interrupted or drop...
-
Conte is back in production, and jobs have started running. Thank you for your patience. ===== Because of additional work required to fix a configuration problem, this maintenance is running past the scheduled end time. We are extending the outage...
-
Emergency maintenance for GitHub
Patching has been completed and the github.rcac.purdue.edu service is back in full production mode. Original message: Tonight, Thursday, January 12, 2017, at 9:00pm – 10:00pm EST github.rcac.purdue.edu will be taken down for brief emergency maintenance. G...
-
The maintenance work was completed successfully and Halstead has been returned to normal operations as of Wednesday, January 11, 2017 at 12:00pm. Original message: The Halstead cluster will be unavailable beginning on Wednesday, January 11th, 2017 at...
-
The maintenance for the Carter cluster was cancelled and will be rescheduled for a later date. The cluster has remained in service. Original notice: The Carter cluster will be unavailable beginning on Tuesday, January 10th, 2017 at 8:00am EST, for emergen...
-
The Scholar cluster will be unavailable beginning on Thursday, January 5th, 2017 at 8:00am EST, for scheduled maintenance. The cluster will return to full production by Thursday, January 5th, 2017 at 5:00pm EST. This work is being done during the se...
-
The Halstead cluster will be unavailable beginning on Wednesday, January 4th, 2017 at 10:00am EST, for scheduled early-access maintenance (see Halstead Cluster Early Access Policies). The cluster will return to full production by Wednesday, January 4...
-
The Halstead cluster is back online as of 4:50 PM after scheduled early-access maintenance. Unfortunately, queued jobs were lost due to complications during maintenance. If you had any jobs queued and waiting before maintenance started, you will need...
-
Unscheduled Outage for EXRC Cluster
Following the restoration of power to the affected building, the EXRC cluster was returned to service on Thursday, December 22nd, 2016 at 2:45pm EST. Original article: As of Tuesday, December 20th, 2016 at 12:00pm EST, EXRC is unavailable due to...
-
UPDATE: As of 7:50 pm, Wednesday, 14 December 2016, this issue is completely resolved. UPDATE: As of about 6:00 pm another problem has been found in the EXRC scheduler code. We will update this news item once we have more details. Original item: The EXR...
-
The maintenance work was completed successfully and Halstead has been returned to normal operations as of Wednesday, December 14, 2016 at 10:00am. Original message: The Halstead cluster will be unavailable beginning on Wednesday, December 14th, 2016 a...
-
The Halstead cluster will be unavailable beginning on Wednesday, December 7th, 2016 at 1:00pm EST, for scheduled early-access maintenance (see Halstead Cluster Early Access Policies). The cluster will return to full production by Wednesday, December...
-
Update: Engineers were able to isolate the problem and restart the necessary systems. The Data Depot should be available again. Halstead users should double check their running work. A portion of the systems serving the Research Data Depot have suffe...
-
Job scheduling paused on Radon
Job scheduling was paused on Radon between 6 pm and 7 pm this evening. Node monitoring processes marked most nodes offline around 6 pm, preventing new jobs from starting. System engineers cleared the fault in the node monitoring, and nodes came back...