Article #3465: Unscheduled Data Depot and community clusters outage
At about 9:30am EDT, Data Depot servers started experiencing a ramping high load. Coupled with ongoing scaling issues with the metadata subsystem,...
The Bell, Brown, Gilbreth, Halstead, Scholar, and Workbench clusters began experiencing issues with mounting the old Data Depot filesystem around 12:30am...
The Brown and Hammer clusters began experiencing issues with cooling in the POD data center around 5:40pm EDT. Engineers are currently diagnosing the i...
The Brown, Hammer, and Weber clusters began experiencing issues with cooling in the POD data center around 11:00am EDT. Engineers are currently diagno...
The Brown cluster began experiencing issues with cooling around 9:00pm EDT. Engineers are currently diagnosing the issue and are working to identify a...
At about 4:00 pm today (Wednesday, 21 July, 2021) System Engineers found an issue with the schedulers on the Bell, Brown, Gilbreth, Halstead, and Scho...
The Gilbreth cluster began experiencing issues with its scratch file system around 5:00pm EDT on Thursday, July 1st, 2021. Engineers are currently dia...
The Bell cluster began experiencing issues with its home and scratch directory filesystems around 12:40pm EDT. Problems manifest as hanging new login...
As of Thursday, June 17th, 2021 at 11:00am EDT, users of community clusters may experience intermittent "permission denied" errors while try...
The Fortress tape archive began experiencing load-induced issues around 1:00pm EDT. Problems manifest as various errors and timeouts while trying to a...
Due to problems with the cooling system in the MATH datacenter, the CMS, Bell, Brown, Gilbreth, Halstead, WCERES, and WSC Hadoop clusters began experienci...
We have received multiple reports about ANSYS Fluent software on the Bell cluster being unavailable. We are currently diagnosing the issue and are working...
The Workbench cluster began experiencing issues with its network uplink around 6:30pm EST. Engineers are currently diagnosing the issue and are worki...
We have received multiple user reports that the Gilbreth cluster began experiencing issues with job submissions over the weekend. The problem manifests as...
The Halstead cluster began experiencing issues with its scratch filesystem mount around 4:30pm EST. Users may see "Stale file handle" messag...
The Data Depot storage server began experiencing issues around 3:00pm EST on Thursday, February 4th, 2021. Engineers are currently diagnosing the issu...
A large number of Scholar accounts have been accidentally removed during overnight processing. This manifests as "LDAP authorization check failed...
The Bell, Brown, Gilbreth, Halstead, Rice, Scholar, and Snyder clusters began experiencing issues with their Data Depot mounts around 10:00pm EST. Eng...
The Bell cluster began experiencing issues with its scratch filesystem around 4:00pm EST. Engineers are currently diagnosing the issue and have opened...
The Bell cluster began experiencing issues with its scratch filesystem around 5:00am EST. Engineers are currently diagnosing the issue and have opened...