Article #851: Unscheduled Storage Outage
The Isilon filesystem was restored to normal service and all affected clusters had it remounted as quickly as the filesystem could sustain. This...
On Monday, May 9, 2016 the environment module system on Carter, Conte, Hansen, and Hathi will be upgraded to Lmod, bringing all compute clusters up to...
A new web-based quota monitoring tool is available to all Research Cluster and Data Depot users. This tool is a web equivalent of the myquota tool on...
UPDATE: The issue with Carter's scratch filesystem has been resolved. The filesystem is now available. Job scheduling on the cluster has been resum...
The scratch storage on Carter and Scholar has been returned to normal operations. The rebuild process will be continuing in the background, so we wil...
Engineering Computing Network (ECN) will be performing staged patching and reboots of all of ECN's RedHat Linux workstations and servers to protect ag...
An issue with the cluster's gateway switches prevented InfiniBand traffic from carrying IP over InfiniBand. This also caused an instabil...
The cause of this turned out to be a power loss to Carter's scratch filesystem and portions of the Data Depot, which has been restored now. Carter no...
Most of the impact of this turned out to be to the Depot storage system, which has now been restored to normal operations. All the other affected sys...
The underlying issues affecting Carter are resolved and job scheduling has been resumed. Many individual nodes remain offline for corrective action,...
Carter has been returned to normal operation. Update: January 20, 2016 3:26pm: We are doing return-to-service testing now and expect Carter to return...
Carter has been returned to normal operations. All queues have been enabled. Update: December 2, 2015 12:15pm Carter is mostly ready to return to serv...
October 30, 2015 11:00am ITaP Engineers have made additional timeout changes to the scratch filesystem, which have increased stability. Additional work...
The scratch filesystem serving Carter/Scholar underwent emergency maintenance through Friday night and well into Saturday. We expect this work to res...
Update: September 23, 2015 8am Shortly after 2am, Engineers were able to complete the file transfer and return Carter to production. Update: Sept...
ITaP is pleased to announce several upgrades to the Carter cluster to better enable data-intensive science. Network To relieve potential bottlenecks o...
Due to power work in the MSEE building, most ECN services will be unavailable between 6:30am – 9:00pm EDT on Saturday, August 15, 2015. For Research C...
ITaP engineers have identified issues causing intermittent failures on Carter. Engineers are currently tuning parameters on the Depot system that have bee...
As of 3:15 pm the maintenance is complete and Research Data Depot is returned to full production. Original message: The storage servers powering the R...
On the morning of Thursday, February 5, 2015, Carter, Conte, Hansen, Peregrine1, Radon, and Rossmann login servers will be rebooted to apply an impor...