Outages and Maintenance
-
Unscheduled Scratch Outage on Carter
UPDATE: As of about 6:30 pm, the new scratch system was brought back online, and scheduling has been restarted on Carter. Original Message: The new scratch filesystem serving Carter that was just activated on Tuesday night is currently unavailable. Bot...
-
Firebox Virtual Server Maintenance
This work was largely postponed due to other concerns. A few related systems were relocated successfully during this time. Original Message: The hardware powering all Firebox Virtual Servers will be unavailable beginning on Tuesday, September 27th,...
-
Home Filesystem Maintenance - All Clusters
Conte has now been returned to normal operations as well. This concludes the home directory maintenance on all systems. Update, September 27, 2016 11:55pm: All systems other than Conte have been successfully returned to normal operations with the ne...
-
The Conte cluster will be unavailable from September 27, 2016 7:00am to September 28, 2016 11:59pm EDT for Home Filesystem Maintenance - All Clusters. The cluster will return to partial production by midnight that day, but will remain at re...
-
Emergency maintenance for GitHub
This afternoon at 5:00pm EDT, github.rcac.purdue.edu was taken down for brief emergency maintenance. GitHub announced a critical security vulnerability this week. As a result, our GitHub instance was taken down for a few minutes of patching. Users of...
-
Unscheduled scratch outage on Carter
UPDATE: ITaP engineers have implemented a temporary solution so that work may continue on Carter until the upcoming scheduled maintenance window on Tuesday. Any running jobs that were using the scratch space have been stopped in order to allow for t...
-
Degraded performance of several systems
We have seen a significant wave of these events this morning, September 21. For the most part, this wave seems to have been linked to a storage problem that has been resolved. However, we are implementing new monitoring and response procedures toda...
-
The maintenance is complete and github.rcac is back in production, now at version 2.7.3. Apart from the usual patches and bugfixes, this is also a feature release. Release notes can be found at: https://enterprise.github.com/releases/2.7.3/notes Than...
-
Emergency Maintenance for EXRC Datacenter
Update: The EXRC system has been temporarily relocated while plumbing repairs and mold abatement are being completed. The system has been returned to service and is fully operational. Once the repairs are complete and the system can be returned to it...
-
Update: As of 5:50 pm, Tuesday, 16 Aug 2016, the Radon cluster has been returned to service and is fully operational. Thank you for your patience. Update: Due to unanticipated conflicts between the upgraded scheduler and our network configuration, the...
-
Unscheduled Outage on Data Depot
UPDATE: As of 5:30 pm, Friday, 5 August 2016, we believe the problem affecting access to the Data Depot has been corrected. Thank you for your patience, and I apologize for the disruption this caused. Original Message: Access to the Data Depot is curr...
-
The monthly GitHub maintenance is now complete and github.rcac is back up and running with the latest bug fixes and security patches. Thank you for your patience. Please let us know if you see any issues at rcac-help@purdue.edu. Original Message: Once...
-
Self-service management web tool outage
As of 3:20 pm, the self-service tool is back in action. An issue with the database backing authentication was discovered and repaired. Original Message: The self-service management tool (user management) is experiencing issues with authentication. Att...
-
Unscheduled Outage on Data Depot
As of 7:30 pm, all methods for connecting to Data Depot have been restored to working order. All connections with Samba (Network Drive mappings: datadepot.rcac.purdue.edu, samba.rcac.purdue.edu) are working normally again. More Rice and Snyder nodes...
-
Engineering Computing Network (ECN) will be performing scheduled maintenance this weekend on several ECN servers, which will be unavailable for a short time as a result. Some ECN services will be affected, including several software license servers for ITa...
-
The maintenance is complete and github.rcac is back in production, now at version 2.6.4. There are several bug fixes and security patches in this version, but no major feature updates. Additionally, an infrastructure change has been made to the way...
-
The underlying storage has been fixed, and all these clusters have been returned to normal operations as of 10:00pm EDT. As of Tuesday, June 7th, 2016 at 4:10pm EDT, Conte, Hansen, Hathi, and Radon are unavailable due to a loss of Isilon home direct...
-
The Depot maintenance has been completed successfully. Depot is now back to normal operations. However, other compute cluster outages running concurrently with this maintenance are still in progress. This maintenance window next week has been reduced as much as...
-
Carter and Scholar are back online for use as of 6:25am, though they will be operating with many nodes still offline. Staff will be working through Wednesday to steadily increase the number of nodes available. This concludes the POD cluster mainten...