Outages

  • Unscheduled outages on portions of clusters

    Conte, Halstead, HalsteadGPU, and Hammer are back in full production. Job scheduling has resumed on all clusters. If you see any lingering issues, please let us know at rcac-help@purdue.edu. UPDATE (July 20, 2017, 2:54pm): Power has been restored to...

  • Email to "rcac-help@purdue.edu" not Working

    As of 3:45pm Friday, the rcac-help@purdue.edu address is working normally again. Original message: Beginning at 5:00pm Thursday, the rcac-help@purdue.edu email address stopped accepting email. Anything sent since then has not been received. We are workin...

  • Email notifications from Research Computing website broken

    Email notifications are up and running again as usual. Original message: As of 5pm Thursday, email notifications from the Research Computing website are not working. Some people are receiving no email and others are receiving damaged emails. T...

  • Halstead Scheduling Outage

    Nodes have continued to gradually reboot into the new image as jobs complete. At this point, more than 80% of Halstead has completed this process, and we have not seen any issues along the way. This outage is closed. Update: May 25, 2017 5:00pm...

  • Data Depot Outage

    Engineers have restored the failed core servers to a functional state. Data Depot is running normally and job scheduling has resumed. Should you encounter any lingering issues, please let us know at rcac-help@purdue.edu. Original message: Some core...

  • Slowdown of Data Depot

    As of 8:48pm, the issue has been resolved. Original message: The Research Data Depot is experiencing a system-wide slowdown. Engineers have isolated the systems at the core of the problem and are taking steps to restore normal service....

  • Scratch system failure on Rice, Snyder, Hammer

    Update: As of 7:00 pm, the problem on the scratch system has been corrected, and scheduling has resumed on all three affected clusters: Rice, Snyder, and Hammer. Update: Storage engineers are working with the system vendor to evaluate a propos...

  • Unscheduled outage on Conte

    As of 2:35 pm, the Conte cluster has been returned to service. Scheduling has resumed in all queues. Update: The source of the problem has been identified and the fix is underway. We anticipate returning Conte to service by 3pm today. Original message: The Conte...

  • Unscheduled Data Depot Outage

    The Data Depot file system was only sporadically available for 2 hours today. Some jobs running on the Community Clusters paused during the instability but have resumed. We do not expect any jobs to have been lost. This issue is now resolved.

  • Scheduler Issue on Halstead

    Halstead nodes continue to come back online. While the cluster is operating normally, the number of available nodes is not yet at full capacity. We will provide an update by 6:00pm. Update: Scheduling has been restarted and jobs are cur...

  • Unscheduled Fortress outage

    The Fortress archival storage system is currently experiencing intermittent connectivity. We expect the situation to be resolved by approximately 1pm. UPDATE: Storage engineers have resolved the connectivity problems and Fortress is back in full prod...

  • Partial scratch outages on Rice, Snyder, Carter, Scholar and Hammer

    The scratch filesystems serving Carter, Hammer, Rice, Scholar, and Snyder started behaving abnormally this morning. This may have affected some jobs, and anyone using one of the login nodes for these clusters may have had sessions freeze or seen dela...

  • Unscheduled Data Depot Outage

    The Research Data Depot has been restored to service. A portion of the systems serving the Research Data Depot has suffered a failure. Some systems using Depot have been affected, particularly research clusters and users accessing the Depot over NFS...

  • Unscheduled scratch outage on Rice, Snyder, and Hammer

    The scratch filesystem serving Hammer, Rice, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Hammer, Rice, and Snyder has been...

  • Halstead MPI problem, scheduling paused

    Following the security updates on Halstead, an issue was discovered that prevented multi-node MPI jobs from running properly. Scheduling on Halstead has been stopped, and systems engineers are working on fixing the issue. We will provide further stat...

  • Unscheduled scratch outage on Conte

    The scratch filesystem serving Conte is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Conte has been paused while storage engineers addres...

  • Connectivity issues to Research Data Depot

    System monitoring has revealed intermittent issues connecting to the Research Data Depot on Thursday January 19. When this issue occurs, users will experience pauses when working in a UNIX shell on community cluster systems, or as interrupted or drop...

  • Unscheduled Outage for EXRC Cluster

    Following the restoration of power to the affected building, the EXRC cluster was returned to service on Thursday, December 22nd, 2016 at 2:45pm EST. Original article: As of Tuesday, December 20th, 2016 at 12:00pm EST, EXRC is unavailable due to...

  • EXRC Scheduling Issue

    UPDATE: As of 7:50 pm, Wednesday, 14 December 2016, this issue is completely resolved. UPDATE: As of about 6:00 pm, another problem has been found in the EXRC scheduler code. We will update this news item once we have more details. Original item: The EXR...

  • Unscheduled Data Depot Outage

    Update: Engineers were able to isolate the problem and restart the necessary systems. The Data Depot should be available again. Halstead users should double-check their running work. A portion of the systems serving the Research Data Depot has suffe...