Article #1073: Unscheduled Scratch Outage on Rice, Snyder, Scholar
The scratch filesystem serving Rice, Scholar, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch...
Access to Data Depot from the Halstead, HalsteadGPU, Hathi, Rice, Scholar, and Snyder clusters has been hanging since around Thursday, September 7th, 2017...
A failure has occurred in the systems which serve Data Depot to the various research clusters. Engineers are currently diagnosing the issue and are wo...
Engineers have restored the failed core servers to a functional state. Data Depot is up and running as normal, and job scheduling has resumed. Should you...
*** Update *** As of 7:00 pm, the problem on the scratch system has been corrected, and scheduling has resumed on all three affected clusters - Rice,...
As of 7:15 pm, all queues on these clusters have resumed scheduling. Nodes will continue to be upgraded as they finish current jobs and become availab...
The Hammer, Rice, Scholar, and Snyder clusters have been returned to service. Please note that ThinLinc clients and web browser access can be found at...
The scratch filesystems serving Carter, Hammer, Rice, Scholar, and Snyder started behaving abnormally this morning. This may have affected some jobs,...
The scratch filesystem serving Hammer, Rice, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch...
Due to a recent security vulnerability, the Carter, Halstead, Hammer, Radon, Rice, Scholar, and Snyder clusters will have their operating system upgra...
Measures taken within the first two hours of this problem seem to have resolved the issue. Original Message: A portion of the systems serving the Rese...
Conte has been returned to normal operations as well now. This concludes the home directory maintenance on all systems. Update: September 27, 2016 1...
We have seen a significant wave of these events this morning, September 21. For the most part, this wave seems to have been linked to a storage probl...
As of 7:30 pm, all methods for connecting to Data Depot have been restored to working order. All connections with Samba (Network Drive mappings: datad...
Engineering Computing Network (ECN) will be performing scheduled maintenance this weekend on several ECN servers, resulting in their unavailability for...
Carter and Scholar are back online for use as of 6:25am, though they will be operating with many nodes still offline. Staff will be working through W...
The problem is now RESOLVED after the reboot of a router. ======= The network serving Snyder is currently experiencing issues. Attempts to log in to t...
The Isilon filesystem was restored to normal service and all affected clusters had it remounted as quickly as was sustainable by the filesystem. This...
Engineering Computing Network (ECN) will be performing staged patching and reboots of all of ECN's Red Hat Linux workstations and servers to protect ag...