
Outages

  • Scheduler Issue on Halstead

    Halstead nodes continue to come back online. While the cluster is operating normally, the total number of available nodes is not yet at full capacity. We will provide an update by 6:00pm. Update: Scheduling has been restarted and jobs are cur...

  • Unscheduled Data Depot Outage

    The Data Depot file system was sporadically available for 2 hours today. Some jobs running on the Community Clusters paused during the instability but have resumed. We expect no job loss to have occurred. This issue is now resolved.

  • Unscheduled outage on Conte

    As of 2:35 pm, the Conte cluster has been returned to service. Scheduling has resumed in all queues. Update: The source of the problem has been identified and the fix is underway. We anticipate returning Conte to service by 3pm today. Original message: The Conte...

  • Scratch system failure on Rice, Snyder, Hammer

    Update: As of 7:00 pm, the problem on the scratch system has been corrected, and scheduling has resumed on all three affected clusters: Rice, Snyder, and Hammer. Update: Storage engineers are working with the system vendor to evaluate a propos...

  • Slowdown of Data Depot

    As of 8:48pm the issue has been resolved. Original message: The Research Data Depot is experiencing a system-wide slowdown. Engineers have isolated the systems at the core of the problem and are taking steps to restore normal service....

  • Data Depot Outage

    Engineers have restored the failed core servers to a functional state. Data Depot is up and running as normal, and job scheduling has resumed. Should you encounter any lingering issues, please let us know at rcac-help@purdue.edu. Original message: Some core...

  • Halstead Scheduling Outage

    Nodes have continued to gradually reboot into the new image as jobs complete. At this point, more than 80% of Halstead has completed this process, and we have not seen any issues along the way. This outage is closed. Update: May 25, 2017 5:00pm...

  • Email notifications from Research Computing website broken

    Email notifications are up and running again as usual. Original message: As of 5pm Thursday evening, email notifications from the Research Computing website are not working. Some people are receiving no email and others are receiving damaged emails. T...

  • Email to "rcac-help@purdue.edu" not Working

    As of 3:45pm Friday, the rcac-help@purdue.edu address is working normally again. Original message: Beginning 5:00pm Thursday, the rcac-help@purdue.edu email address stopped accepting email. Anything sent since then has not been received. We are workin...

  • Unscheduled outages on portions of clusters

    Conte, Halstead, HalsteadGPU, and Hammer are back in full production. Job scheduling has resumed on all clusters. If you see any lingering issues, please let us know at rcac-help@purdue.edu. Update, July 20, 2017 2:54pm: Power has been restored to...

  • Unscheduled Depot Outage

    A failure has occurred in the systems which serve Data Depot to the various research clusters. Engineers are currently diagnosing the issue and are working to identify a fix. Job scheduling has been paused on all systems while this issue is being add...

  • Unscheduled Outage in Math Data Center

    At approximately 2:00pm EDT on Tuesday, September 5th, 2017, the Math building data center lost some power feeds which supply the Conte, Halstead, HalsteadGPU, Hathi, and Radon clusters. Scheduling on these has been paused until we can be sure power...

  • Unscheduled Depot Outage

    Access to Data Depot from the Halstead, HalsteadGPU, Hathi, Rice, Scholar, and Snyder clusters has been hung since around 1:30pm EDT on Thursday, September 7th, 2017. Engineers are currently working to restore service to these systems. Job scheduling h...

  • Unscheduled scratch outage on Rice, Scholar and Snyder

    The scratch filesystem serving Rice, Scholar, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Rice, Scholar, and Snyder has be...

  • Unscheduled Scratch Outage on Rice, Snyder, Scholar

    The scratch filesystem serving Rice, Scholar, and Snyder is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Rice, Scholar, and Snyder has bee...

  • Unscheduled Hathi Outage

    Update: As of 5:00 PM, the cluster is back in production. Around 7:00am EST, the Hathi cluster began experiencing various issues stemming from a recent kernel upgrade. Engineers are currently diagnosing the issue and are working to identify a fix. We wi...

  • Unscheduled WSC Outage

    The WSC Hadoop cluster began experiencing issues with login access around 10:30am EST. Engineers have identified the problem and are addressing it now. We expect to have service restored soon and will issue an update then.

  • Fortress Archive Unavailable

    The Fortress archive is unavailable due to a datacenter power issue. Datacenter facilities staff are currently investigating; however, at this time there is no estimate for a return to service.

  • Unscheduled Depot Outage on Compute Clusters

    The servers providing access to Data Depot from Brown, Conte, Halstead, HalsteadGPU, Radon, Rice, Scholar, and Snyder suffered a partial failure. Many nodes in these clusters temporarily lost access to Depot. Jobs accessing files on Depot may have pa...

  • Unscheduled scratch outage on Rice and Scholar clusters

    The scratch filesystem serving Rice and Scholar is currently unavailable. Both currently running jobs and attempts to access files in scratch will block until the filesystem is back online. Job scheduling on Rice and Scholar has been paused while st...