Outages and Maintenance
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, March 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transfer...
-
Unscheduled Math data center cooling outage
The Math building data center began experiencing issues with its cooling system around 11:40am EST. As one manifestation of this problem, users may experience issues while logging in to the Anvil, Bell, Gilbreth, Halstead, Workbench, and Data Depot clusters. To m...
-
Fortress Tape Archive Maintenance
The Fortress tape archive library will undergo replacement work on one of its tape-picking robotic arms on Thursday, February 24, 2022 from 8:30am - 5:00pm EST. During this time, Fortress will remain available and functional, but users may observe de...
-
As of 8:00pm EST on Friday, February 11th, 2022, the Data Depot filesystem outage has been resolved and scheduling has been resumed on all clusters. The Bell, Brown, Gilbreth, Halstead, Scholar, Workbench, and Data Depot clusters began experiencing i...
-
The Gilbreth cluster began experiencing issues with its Data Depot mounts around 9:00am EST. The /depot filesystem is not visible on some of the login and compute nodes. Engineers are currently diagnosing the issue and are working to identify a fix....
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, February 2, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Bell cluster began experiencing issues with its scheduler database around 11:35am EST. The problem manifests as freezing and/or "socket timed out" and "Unable to contact slurm controller" error messages upon the usual Slurm comman...
-
The Weber cluster began experiencing issues with its weber-sftp subsystem around 2:00pm EST. The problem affects the ingress/egress path to the cluster. Engineers are currently diagnosing the issue and are working to identify a fix. We will provide an upda...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, January 5, 2022 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any transf...
-
The Scholar cluster will be unavailable from January 4, 2022 at 8:00am to January 5, 2022 at 6:00pm EST for scheduled maintenance. The cluster will return to full production by Wednesday, January 5th, 2022 at 6:00pm EST. During this time, Scholar will have the...
-
The Bell cluster began experiencing issues with its scratch filesystem around 6:30pm EST. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this issue is being addressed. We will prov...
-
The Weber cluster began experiencing issues with an expired VPN certificate around 10:00am EST. Engineers are currently diagnosing the issue and are working to identify a fix. We will provide an update by 5pm.
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, December 1, 2021 from 8:30am - 12:00pm EST for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
Fortress Archive Monthly Maintenance
The Fortress Archive will be unavailable Wednesday, November 3, 2021 from 8:30am - 12:00pm EDT for scheduled monthly maintenance (first Wednesday of every month). During this time, Fortress will receive normal software and hardware updates. Any trans...
-
The Bell cluster began experiencing issues with high load and sluggish performance on the scratch filesystem around 1:20pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused while this...
-
The Weber cluster will be unavailable Wednesday, October 20, 2021 from 8:00am - 8:00pm EDT for scheduled maintenance. The cluster will return to full production by Wednesday, October 20th, 2021 at 8:00pm EDT. During this time, Weber will be expanded...
-
Unscheduled multiple clusters and Data Depot outage
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Workbench clusters and Data Depot began experiencing issues with intermittent high load on the Data Depot servers around 4:30pm EDT. Engineers are currently diagnosing the issue and are working to...
-
Unscheduled Brown and Hammer outage
The Brown and Hammer clusters began experiencing issues with cooling due to problems at the Physical Facilities' chiller plant around 4:40pm EDT. To avoid overheating, job scheduling has been paused while this issue is being addressed. We will provid...
-
Unscheduled multiple clusters and Data Depot outage
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Workbench clusters and the Data Depot servers began experiencing issues with Data Depot mounting on Wednesday, September 29th, 2021 around 4:40pm EDT. Engineers are currently diagnosing the issue and...
-
Unscheduled Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Data Depot outage
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Data Depot clusters began experiencing issues with Data Depot mounting around 7:00am EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been...