Unscheduled Halstead outage

  • February 6, 2021 4:30pm - 9:30pm EST
  • Outages and Maintenance
  • Halstead

UPDATE: February 6, 2021 9:23pm EST

Engineers identified the failed component that caused the scratch filesystem mount problems and restored the scratch mounts on Halstead to their proper state.

As of 9:23pm EST, the Halstead cluster has returned to normal service. Job queues are enabled and job scheduling has resumed. We apologize for the disruption in service. Please report any issues to rcac-help@purdue.edu.

ORIGINAL: February 6, 2021 4:30pm - 9:30pm EST

The Halstead cluster began experiencing issues with its scratch filesystem mount around 4:30pm EST. Users may see "Stale file handle" messages or be unable to navigate to their scratch directories. Engineers are diagnosing the issue and working to identify a fix. Job scheduling is paused while the issue is being addressed.

We will provide an update by 9pm EST tonight.

Originally posted: February 6, 2021 5:30pm EST