Outages and Maintenance
-
Updated 3/29/2013: During the datacenter migration, the core server for the Fortress HPSS system experienced a failure with its boot device. Vendor engineers are working on the issue, and as a result, the estimate for Fortress's return to production ha...
-
Update: ITaP engineers have corrected the issue affecting the LustreC filesystem. The system is back in production. Job scheduling on Carter, Hansen and Peregrine1 has been restarted. As always, thank you for your patience. If you encounter any issue...
-
During the March 12 to March 14 maintenance window, the WinHPC cluster will be unavailable due to upgrades to the electrical service in the MATH data center. WinHPC will be shut down at 8:00 am Tuesday, 12 March, 2013, and is expected to return to se...
-
On Tuesday, 12 March 2013, the samba.rcac.purdue.edu host will be offline for about 2 hours between 8:00 and noon for maintenance. This will not affect any running jobs or new job submission, but will mean that people who use Samba to map their home...
-
As of 9:00am, we are seeing a problem with the LustreC scratch filesystem that serves Carter, Hansen, and Peregrine1. To prevent any more jobs from running into this, we have temporarily suspended scheduling of new jobs, though you may still submit to...
-
Update: As of about 11:00 am, the problem with the chilled water has been corrected, and scheduling has resumed on all RCAC clusters. Thank you for your patience. If you encounter any issues or have questions, please contact us at rcac-help@purdue.ed...
-
Campus chilled water serving the MATH data center is experiencing above-normal temperatures, and as a precaution, scheduling on the Coates, Rossmann, Hansen, Carter, and Radon clusters has been stopped. Steele is not affected. There should be no impa...
-
Unexpected Power Outage in MATH
Update: Noon, 1/8/13 The power issue in MATH has been resolved. Power has been restored to the nodes in the Coates-A subcluster affected by the outage. ITaP engineers have verified that the Coates-A subcluster is operating correctly, and have restart...
-
Scheduled Maintenance - RCAC home directory upgrades
Update - 7:00pm, 1/4/2013: All community clusters (Steele, Coates, Rossmann, Hansen, Carter, and Peregrine1) are back in production. Radon is currently not in production, as ITaP engineers are addressing issues encountered during the upgrade. T...
-
Software Stack Changes during Scheduled Maintenance
During the New Year's weekend holiday, all ITaP HPC resources will be unavailable due to a scheduled upgrade of research home directories. While the systems are down they will also receive several updates to the software stack and modules. These upda...
-
Scheduling paused on ITaP research clusters
During scheduled maintenance on network equipment connecting storage to ITaP clusters, all scheduling will be paused from 4-6pm. Running jobs will continue to execute, and new jobs may be submitted to PBS queues, but no new jobs will start u...
-
Scheduled Maintenance for Radon Cluster
UPDATE - 12:40 pm 27 Nov 2012: The update went smoothly, with no delays or problems, and Radon has returned to service as of 12:30 pm, 5 and a half hours earlier than expected. Please let us know at rcac-help@purdue.edu if you see any problems with...
-
Scheduled Maintenance - October 2012
UPDATE: 9 October, 2012 The Coates and Rossmann clusters have both returned to production, and their maintenance is completed, as of 11:30 am, Tuesday 9 October, 2012. The Coates and Rossmann clusters will go down for scheduled maintenance at 8:00 am...
-
ADIC Scalar 10k tape library maintenance
From 8:00am-12pm on Friday, September 7, 2012, the ADIC Scalar 10K library serving the Fortress archive will be unavailable while emergency preventative maintenance is performed. Fortress will still be able to write files into HPSS, and files already...
-
Unscheduled Power outage in Math Datacenter
Update: 10:00pm Tuesday As of 8:30pm Tuesday 21 August 2012, the LustreB filesystem has been returned to full service. Our storage engineers with assistance of the vendor have verified that the system is stable. If you encounter any issues, please co...
-
Scheduled Maintenance - August 2012
In August 2012, some RCAC systems will be down for maintenance for up to three days in order to accommodate electrical service and chilled water upgrades in the Math building and OS and scheduler upgrades on the systems. Planned Maintenance Timelin...
-
Scheduled Maintenance - May 2012
In May 2012, all RCAC systems will each be down for maintenance for up to three days in order to accommodate electrical service work in the Math building and storage systems maintenance. Some systems will also be receiving OS and scheduler upgrades....
-
Community clusters, storage to be off line for upgrades and maintenance
Purdue’s Community Cluster Program supercomputers, related high-performance data storage and the Fortress archival data storage system will be down for scheduled maintenance for up to three days from May 15-17. For details, see the rcac-help@purdue.e...
-
Update - April 11, 2012 2:40pm At around 2:40pm, ITaP engineers restored communications between the HPSS system and the tape library. Access to Fortress from Samba, HSI/HTAR and other methods has been restored. I apologize for the inconvenience th...
-
Update: 1:45pm As of 1:45pm this afternoon, systems staff have completed patching the Samba servers used to access storage systems. You should now be able to connect to samba.rcac.purdue.edu for Samba access to home and scratch directories and...