Outages and Maintenance
-
Unscheduled outage for Samba/Windows
Service was restored around 7:30pm today. Engineers changed the way Samba authenticates users to avoid this problem going forward. -- Service was restored around 10:30am today, but has since failed again. Engineers are working on the problem, and we...
-
October 22, 2015 9:15pm All services have been restored and Hammer is now in production. October 22, 2015 7:00pm Engineers continue to work through issues relating to the move. Another update will be sent at 9pm. Original The Hammer cluster will be...
-
October 30, 2015 11:00am ITaP Engineers have made additional timeout changes to the scratch filesystem which has increased stability. Additional work is being scheduled for Tuesday, December 1, 2015 from 7:00am to 7:00pm. October 8, 2015 5:00pm An e...
-
Emergency scratch maintenance on Carter and Scholar
The scratch filesystem serving Carter/Scholar underwent emergency maintenance through Friday night and well into Saturday. We expect this work to resolve the periodic hangs this filesystem has been experiencing for the last two days. Job scheduling...
-
Cluster Maintenance - Hansen/Peregrine1
Update: September 22, 2015 1pm The work affecting the Hansen and Peregrine1 scratch filesystems has been completed and the clusters are back in full production. Original The Hansen and Peregrine1 clusters will be unavailable beginning at Tuesday, Septembe...
-
Update: September 23, 2015 8am Shortly after 2am, Engineers were able to complete the file transfer and return Carter back to production. Update: September 22, 2015 11pm The file transfer continues and will last well into the night. The next update...
-
Unscheduled scratch outage on Rossmann
**Update: August 25, 2015 9:00 pm** On Monday, August 24, a disk tray in the Rossmann scratch storage system suffered multiple failures and, despite great effort by both ITaP storage engineers and the system vendor, this portion of the scratch system...
-
As of 11:55 pm August 18, 2015, Fortress/HPSS has been brought back online. Storage engineers continue working on bringing the upgraded Fortress up and deploying new software to all RCAC systems. Current estimate for return to service: 12:00 am August 1...
-
Cluster Maintenance - Peregrine1
The Peregrine1 cluster will be unavailable from August 17, 2015 8:00am to August 19, 2015 6:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, August 19th, 2015 at 6:00pm EDT. During this time, Pere...
-
Unscheduled scratch outage on Rossmann
UPDATE As of 8pm on August 15, 2015 the scratch filesystem serving Rossmann is back in full production. Original message: The scratch filesystem serving Rossmann is currently unavailable. Both currently running jobs and attempts to access files in sc...
-
Due to power work in the MSEE building, most ECN services will be unavailable between 6:30am – 9:00pm EDT on Saturday, August 15, 2015. For Research Computing users this means that software packages licensed through ECN servers will not be able to ch...
-
The Hammer, Hathi, Radon, and Snyder clusters will be unavailable on Wednesday, July 1, 2015 from 8:00am to 12:00pm EDT for scheduled maintenance. The clusters will return to full production by Wednesday, July 1st, 2015 at 12:00pm EDT. The do...
-
Data Depot connectivity issues
ITaP engineers have identified issues causing intermittent failures on Carter. Engineers are currently tuning parameters on the Depot system that have been identified as potential fixes for these issues. Access to Depot on Carter has been stable since tunin...
-
Fortress Service Unavailable June 23
The Fortress data archiving services will be unavailable starting at 8:00AM on 23 June, 2015 due to scheduled maintenance. During this outage, our storage engineers will upgrade hardware and configure RAID on the internal servers. Users are reques...
-
Rice job submission failing for some users
Update: The scheduling server has been rebooted and job submissions appear to be working normally again. Please let us know at rcac-help@purdue.edu if you see any further issues. Thanks again for your patience! Job submissions for at least some users...
-
Software upgrades on the Rice cluster were completed by 7:30pm. It is now open for access by early adopters. Please let us know if you see any issues with the cluster. Maintenance on Snyder, Rossmann, Hansen, Hammer, and Conte has been completed and...
-
Due to power work in the MSEE building, most ECN services will be unavailable between 5:30 pm Thursday, 11 June, 2015 and 8:00 am Friday 12 June 2015. In particular, for Research Computing users this means that software packages licensed through ECN...
-
Fortress Samba service has been restored as of 10:15am on Monday, June 8th. We apologize for any inconvenience this has caused and thank you for your patience. Beginning Friday afternoon, the Fortress Samba mounts became unavailable due to an issue with...
-
UPDATE As of 4:45 pm Tuesday, May 19, all the work noted below has been completed and both Hansen and Peregrine-1 have been returned to full service. Thanks for your patience. Original message: The Hansen and Peregrine-1 clusters will be unavailable beginning at...
-
UPDATE As of 9:00 pm Tuesday, 14 April, 2015, the Conte cluster is back in full production mode. During the maintenance, all nodes were checked for reliability and system software installations were checked for consistency between nodes, and issues...