Outages and Maintenance
-
The Gilbreth cluster will be unavailable Thursday, February 25, 2021 at 8:00am EST for scheduled maintenance. During this time, Gilbreth will have the latest CUDA driver and toolkit softwar...
-
The Bell cluster will be unavailable Tuesday, February 23, 2021 at 8:00am EST for scheduled maintenance. During this time, Bell will have a maintenance upgrade performed on the software com...
-
Gilbreth queue submission problems
We have received multiple user reports that the Gilbreth cluster began experiencing issues with job submissions over the weekend. The problem manifests as an "Invalid account or account/partition combination specified" error message from sbatch...
-
The Halstead cluster began experiencing issues with its scratch filesystem mount around 4:30pm EST. Users may see "Stale file handle" messages or be unable to navigate to their scratch directories. Engineers are currently diagnosing the iss...
-
The Data Depot storage server began experiencing issues around 3:00pm EST on Thursday, February 4th, 2021. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused on all clusters while this issue...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, February 3, 2021 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Rice cluster has reached the end of its life cycle and is being retired on Friday, January 15th, 2021. Researchers owning nodes on Rice should start archiving any data they may have there to the Fortress Archive now, or move it to other clusters...
-
The bulk of the Snyder cluster (A and B nodes of 2015 vintage) has reached the end of its life cycle and is being retired on Friday, January 15th, 2021. Researchers owning nodes in A or B sub-clusters should start archiving any data they may have th...
-
A large number of Scholar accounts were accidentally removed during overnight processing. This manifests as "LDAP authorization check failed", "Incorrect or Invalid username/password", and similar errors when trying to logi...
-
The Bell, Brown, Gilbreth, Halstead, Rice, Scholar, and Snyder clusters began experiencing issues with their Data Depot mounts around 10:00pm EST. Engineers are currently diagnosing the issue and are working to identify a fix. To avoid job losses for...
-
The Bell cluster began experiencing issues with its scratch filesystem around 4:00pm EST. Engineers are currently diagnosing the issue and have opened a ticket with the vendor to identify a fix. Job scheduling has been paused while this issue is bein...
-
The Bell cluster began experiencing issues with its scratch filesystem around 5:00am EST. Engineers are currently diagnosing the issue and have opened a ticket with the vendor to identify a fix. Job scheduling has been paused while this issue is be...
-
The Bell cluster began experiencing issues with metadata on its scratch filesystem around 9:00pm. The problem manifests as the ls -l command hanging indefinitely, while plain ls (or \ls, or stat FILE) appears to work. Engineers are...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, January 6, 2021 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transf...
-
Access to RCAC Resources During ITaP Central Authentication Outage
On Sunday, December 27th, 2020, ITaP staff will perform major upgrades to the central authentication infrastructure. All applications that require logging in with BoilerKey or Career Account credentials will be unavailable Sunday, December 27, 2020 f...
-
The Bell cluster will be unavailable Wednesday, December 16, 2020 at 11:00am EST for scheduled maintenance. During this time, work will be performed on several auxiliary servers. Prior to the maintenance, any SLURM jobs which request a walltime which...
-
The Scholar cluster will be taken down for regular inter-semester maintenance and upgrades starting Wednesday, December 16th, 2020 at 8:00am EST. All jobs that cannot complete before then will be held in the queue during this time, and no one will be a...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, December 2, 2020 from 8:00am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The ITaP GitHub service (github.itap.purdue.edu) will be unavailable Tuesday, December 1, 2020 from 1:00pm - 4:00pm EST for scheduled maintenance. The service will return to full production by Tuesday, December 1st, 2020 at 4:00pm EST. During this ti...
-
Monthly RCAC GitHub Maintenance
The Research Computing GitHub service (github.rcac.purdue.edu) will be unavailable Tuesday, December 1, 2020 from 9:00am - 12:00pm EST for scheduled maintenance. The service will return to full production by Tuesday, December 1st, 2020 at 12:00pm EST...