Article #5174: Unscheduled Bell outage
A section of the Bell cluster compute nodes began experiencing issues with power feed and cooling around 2:30pm EDT. Engineers have powered down affec...
The Math building data center began experiencing issues with its cooling system around 1:40pm EDT. To minimize thermal load on the cooling infrastructu...
As announced during the recent Bell outage, some temporary austerity measures need to be implemented to prevent the scratch file system from filling u...
Bell Scratch is near capacity and performance is degraded. As of this morning, Bell Scratch was 94% full. This afternoon we paused scheduling as scrat...
Several Research Computing resources were affected by a campus power outage around 7:00pm EDT. Multiple login and compute nodes may have powered dow...
The Bell cluster began experiencing issues with its scratch filesystem around 9:00pm EDT on Saturday, April 9th, 2022. Access to files in scratch may...
As of 9:00am EDT, users of community clusters may experience slowness while trying to access Data Depot (including loading modules, starting applicati...
The Math building data center began experiencing issues with its cooling system around 11:40am EDT. As one of the manifestations, users may experience issu...
The majority of Research Computing computational resources (Bell, Brown, Geddes, Gilbreth, Halstead, Hammer, Scholar, Weber, and Workbench clusters) w...
The Math building data center began experiencing issues with its cooling system around 11:40am EST. As one of the manifestations, users may experience issu...
As of 8:00pm EST on Friday, February 11th, 2022 the Data Depot filesystem outage has been resolved and scheduling has been resumed on all clusters....
The Bell cluster began experiencing issues with its scheduler database around 11:35am EST. The problem manifests as freezing and/or "socket timed out...
The Bell cluster began experiencing issues with its scratch filesystem around 6:30pm EST. Engineers are currently diagnosing the issue and are working...
Research Computing personnel will observe the university winter break from 5:00pm EST on Wednesday, December 22nd, 2021, and will resume normal bu...
The Bell cluster began experiencing issues with high load and sluggish performance on the scratch filesystem around 1:20pm EDT. Engineers are currentl...
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Workbench clusters and Data Depot began experiencing issues with intermittent high load on the D...
The Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, and Workbench clusters and Data Depot servers began experiencing issues with Data Depot mounting on...
The Bell, Brown, Gilbreth, Halstead, Hammer, and Scholar clusters and Data Depot began experiencing issues with Data Depot mounting around 7:00am EDT. Eng...
At about 9:30am EDT, Data Depot servers started experiencing a ramping high load. Coupled with ongoing scaling issues with the metadata subsystem,...
The Bell, Brown, Gilbreth, Halstead, Scholar, and Workbench clusters began experiencing issues with mounting the old Data Depot filesystem around 12:30am...