Outages and Maintenance
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, May 4, 2022 from 8:30am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfers...
-
Unscheduled campus power outage
Several Research Computing resources were affected by a campus power outage around 7:00pm EDT. Multiple login and compute nodes may have powered down, causing jobs to fail and/or requeue with a NODE_FAIL or similar status. Engineers are currently d...
-
The Bell cluster began experiencing issues with its scratch filesystem around 9:00pm EDT on Saturday, April 9th, 2022. Access to files in scratch may appear severely delayed or frozen. Engineers are currently diagnosing the issue and are working to...
-
Unscheduled Weber ingress/egress outage
The Weber cluster's data transfer server (weber-sftp.rcac.purdue.edu) suffered a cooling fan failure around 8:30pm EDT on Saturday, April 9th, 2022. The cluster remains operational with the exception of ingress/egress of files via the affected server....
-
Gilbreth scratch degraded performance
Following last night's scratch outage, the Gilbreth scratch filesystem is currently functional but is operating with partially degraded performance. Engineers have opened a support ticket with the vendor and are monitoring the state of the filesystem continuou...
-
Unscheduled Gilbreth cluster outage
The Gilbreth cluster began experiencing issues with its scratch filesystem around 7:00pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is being addressed. We will...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, April 6, 2022 from 8:00am - 2:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfers...
-
The Weber cluster will be unavailable Wednesday, March 30, 2022 from 8:00am - 6:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, March 30th, 2022 at 6:00pm EDT.
-
Unscheduled Data Depot Slowdown on Community Clusters
As of 9:00am EDT, users of community clusters may experience slowness while trying to access Data Depot (including loading modules, starting applications or reading data). The symptoms appear on both login and compute nodes. System engineers are act...
-
Unscheduled Math Data Center Cooling Outage
The Math building data center began experiencing issues with its cooling system around 11:40am EDT. As one manifestation, users may experience issues while logging in to the Anvil, Bell, Gilbreth, and Halstead clusters. To minimize thermal load on...
-
Whole-Floor Data Depot Maintenance
The Data Depot filesystem will be undergoing scheduled maintenance and will be unavailable for use from March 15, 2022 5:00pm - March 16, 2022 8:00am EDT for critical software updates which can only be applied during a full service downtime. During...
-
github.itap Offline for Scheduled Whole-Floor Maintenance
During the RCAC whole-floor downtime for scheduled Data Depot maintenance, ITaP’s GitHub Enterprise service will be unavailable. In an effort to reduce impact to developers, this work will be performed during off hours on Tuesday, March 15, 2022 from...
-
Whole-Floor Cluster Maintenance
The majority of Research Computing computational resources (Bell, Brown, Geddes, Gilbreth, Halstead, Hammer, Scholar, Weber, and Workbench clusters) will be unavailable March 15, 2022 4:00pm - March 16, 2022 12:00pm EDT during Whole-Floor Data Depo...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, March 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfer...
-
Unscheduled Math data center cooling outage
The Math building data center began experiencing issues with its cooling system around 11:40am EST. As one manifestation, users may experience issues while logging in to the Anvil, Bell, Gilbreth, Halstead, Workbench, and Data Depot clusters. To m...
-
Fortress Tape Archive Maintenance
The Fortress tape archive library will undergo replacement work on one of its tape-picking robotic arms on Thursday, February 24, 2022 from 8:30am - 5:00pm EST. During this time, Fortress will remain available and functional, but users may observe de...
-
As of 8:00pm EST on Friday, February 11th, 2022, the Data Depot filesystem outage has been resolved and scheduling has been resumed on all clusters. The Bell, Brown, Gilbreth, Halstead, Scholar, Workbench, and Data Depot clusters began experiencing i...
-
The Gilbreth cluster began experiencing issues with its Data Depot mounts around 9:00am EST. The /depot filesystem is not visible on some of the login and compute nodes. Engineers are currently diagnosing the issue and are working to identify a fix....
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, February 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Bell cluster began experiencing issues with its scheduler database around 11:35am EST. The problem manifests as freezing and/or "socket timed out" and "Unable to contact slurm controller" error messages upon the usual Slurm comman...