Article #1583: Unscheduled Data Depot outage on the clusters
The Brown, Gilbreth, Halstead, Hammer, Rice, Scholar, Snyder, and Workbench clusters began experiencing issues with connection to Data Depot filesyste...
Hammer, Scholar, Snyder, WCERES, WSC Hadoop, and Data Depot began experiencing issues with networking around 10:00am EST. Engineers are currently diag...
The Gilbreth cluster began experiencing issues with its scratch filesystem around 11:30am EST. Engineers are currently diagnosing the issue and are wo...
The Fortress Archive began experiencing issues with an internal database around 4:30pm. Engineers are currently working to remove the affected databas...
Data Depot suffered a system failure around 5:15pm EDT. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling...
The Rice cluster began experiencing issues with the scratch filesystem around 4:40pm EDT. Engineers are currently diagnosing the issue and are working...
The Brown and Hammer clusters experienced a partial power outage overnight, which caused them to operate at reduced capacity. Engineers are currentl...
All clusters began experiencing issues with the home file system around 12:30pm EDT. Engineers are currently diagnosing the issue and are working to i...
Starting at 2:45pm EDT, new jobs are not being scheduled on the Brown cluster in order to lighten the load. The current hot weather outside is overtax...
Work continues on bringing Hammer back to normal operation. Engineers have identified the source of the problem and are currently working to find a so...
Update: The issues with the Hammer cluster have been resolved and the cluster is back in production. This outage is closed. Original: The Hammer cluste...
Several clusters experienced network connectivity and/or power issues around 10:00am EDT. Engineers are working on assessing and analyzing the si...
The Rice cluster is currently experiencing a vendor bug in its scratch filesystem. To prevent filesystem instability, job scheduling has been paused w...
GitHub is currently offline and is not responding. Engineers are currently working on bringing GitHub back up. We will provide another update later to...
At approximately 8:30am EDT, the Brown, Hammer, Rice, and Snyder clusters became unavailable due to a campus power outage. While power has been resto...
The Fortress tape archive began experiencing issues with Globus and HSI access around 2:30pm EDT on Sunday, March 10th, 2019. Engineers are currently...
All clusters began experiencing issues with logins and a general slowdown around 3:10pm EST. This has been identified as being due to an issue on the...
Fortress began experiencing issues around 11:15am EST. Engineers are currently diagnosing the issue and are working to identify a fix. We will provide...
The Brown and BrownGPU clusters began experiencing issues with the scratch filesystem around 8:00am EST. Engineers are currently diagnosing the issue a...
Halstead, HalsteadGPU, Brown, and BrownGPU went offline during a campus power event around 8:40 am this morning. Engineers are working to bring the co...