Unscheduled network outage on Brown, Rice, Snyder and Hammer
July 18, 2018 2:30am – 12:15pm
Brown, Hammer, Rice, Snyder
As of 12:15 pm Wednesday, 18 July 2018, Brown has returned to service and queued jobs are starting. As with the other clusters affected by this outage, jobs that were interrupted may need to be resubmitted.
This concludes the unscheduled outage of the Brown, Rice, Snyder, and Hammer clusters.
UPDATE: July 18, 2018 10:48am
As of 10:45 am, the Snyder and Hammer clusters are also back online and scheduling queued jobs. Jobs that were already running when the outage started may need to be resubmitted.
UPDATE: July 18, 2018 10:37am
As of 10:35 am, the Rice cluster has been returned to service and queued jobs are being started. Jobs that were running when the outage started may need to be resubmitted.
ORIGINAL: July 18, 2018 3:27am
The Brown, Hammer, Rice, and Snyder clusters began experiencing issues with scratch filesystems and network connectivity around 2:30 am. Engineers are diagnosing the issue and working to identify a fix. Job scheduling has been paused while the issue is being addressed.
We will provide an update by 10 am.