Carter

  • Cluster Maintenance - Carter

    Carter has been returned to normal operations. All queues have been enabled. Update: December 2, 2015 12:15pm: Carter is mostly ready to return to service, but the site-wide home filesystem has suffered a failure which is preventing this from being co...

  • Cluster Maintenance - Carter

    Carter has been returned to normal operation. Update: January 20, 2016 3:26pm: We are doing return to service testing now and expect Carter to return to production by 7:00pm. Update: January 20, 2016 12:00pm: Work is being wrapped up on Carter and...

  • Unscheduled outage on Carter

    The underlying issues affecting Carter are resolved and job scheduling has been resumed. Many individual nodes remain offline for corrective action, and these will be returning to service gradually as engineers are able to fix them. In the interim,...

  • Unscheduled outage on Carter

    The cause of this turned out to be a power loss to Carter's scratch filesystem and portions of the Data Depot, which has been restored now. Carter nodes are returning to normal operations now. Original Message: As of Thursday, February 4th, 2016 at...

  • Unscheduled Outage in Math Data Center

    Most of the impact of this turned out to be to the Depot storage system, which has now been restored to normal operations. All the other affected systems are showing a return to normal operations now. Original Message: As of Thursday, February 4th,...

  • Unscheduled scratch outage on Carter

    There was an issue with the cluster's gateway switches that prevented InfiniBand traffic from carrying IP over InfiniBand. This also destabilized the Lustre scratch servers, which then had to be rebooted. Jobs that were using scratc...

  • ECN services outage - ITaP Research Computing systems impacted

    Engineering Computing Network (ECN) will be performing staged patching and reboots of all of ECN's Red Hat Linux workstations and servers to protect against a serious vulnerability in the glibc system library. A significant number of ECN services will be...

  • Unscheduled Scratch Outage on Carter

    The scratch storage on Carter and Scholar has been returned to normal operations. The rebuild process will be continuing in the background, so we will be watching for any degradation in the storage performance. All queues have been re-activated. T...

  • Unscheduled Scratch Outage on Carter

    UPDATE: The issue with Carter's scratch filesystem has been resolved. The filesystem is now available. Job scheduling on the cluster has been resumed. The scratch filesystem serving Carter is currently unavailable. Job scheduling on Carter has bee...

  • New web-based quota monitoring tool

    A new web-based quota monitoring tool is available to all Research Cluster and Data Depot users. This tool is a web equivalent of the myquota tool on the clusters. The tool allows you to monitor your quota usage just like myquota, but it also allows...

  • Environment Modules System Upgrade

    On Monday, May 9, 2016 the environment module system on Carter, Conte, Hansen, and Hathi will be upgraded to Lmod, bringing all compute clusters up to the same environment modules system. This new system has been in use on the Rice and Snyder cluster...

  • Unscheduled Storage Outage

    The Isilon filesystem was restored to normal service and all affected clusters had it remounted as quickly as was sustainable by the filesystem. This process was completed by Wednesday, May 18th, 2016 at 12:15am EDT. All clusters other than Conte (...

  • POD Cluster Maintenance

    Carter and Scholar are back online for use as of 6:25am, though they will be operating with many nodes still offline. Staff will be working through Wednesday to steadily increase the number of nodes available. This concludes the POD cluster mainten...

  • ECN Services Outage

    Engineering Computing Network (ECN) will be performing scheduled maintenance this weekend on several ECN servers, resulting in their unavailability for a short time. Some ECN services will be affected, including several software license servers for ITa...

  • Degraded performance of several systems

    We have seen a significant wave of these events this morning, September 21. For the most part, this wave seems to have been linked to a storage problem that has been resolved. However, we are implementing new monitoring and response procedures toda...

  • Unscheduled scratch outage on Carter

    UPDATE: ITaP engineers have implemented a temporary solution so that work may continue on Carter until the upcoming scheduled maintenance window on Tuesday. Any running jobs that were using the scratch space have been stopped in order to allow for t...

  • Software stack changes and upgrades

    During the Home Filesystem Maintenance - All Clusters maintenance on September 27th, several upgrades and changes will be made to the software stack on the clusters. Changes will include updates to the default version of the Intel compiler and associ...

  • New Carter Scratch Filesystem

    We are seeing some issues with the systems in the warp-scratch set of hosts. You may encounter an error with your home directory and/or a message about permissions upon login. Even if you see this, you may find the system is still able to access bo...

  • Home Filesystem Maintenance - All Clusters

    Conte has now been returned to normal operations as well. This concludes the home directory maintenance on all systems. Update: September 27, 2016 11:55pm: All systems other than Conte have been successfully returned to normal operations with the ne...

  • Carter Scratch Transfer Tool

    On September 27th, 2016, the Carter cluster scratch filesystem, which had been suffering from numerous issues, was replaced by an entirely new system. Unfortunately, in order to put the new system into place quickly, it was not possible to copy over...