Halstead and Brown unscheduled outage

February 11, 2019 8:40am - February 12, 2019 4:00pm EST
Outages
Brown, BrownGPU, Halstead, HalsteadGPU

UPDATE: February 12, 2019 4:23pm EST

As of 4:00 pm, the Halstead and HalsteadGPU scratch system and clusters have been returned to normal service. Job queues have been enabled and job scheduling has resumed. Please report any issues to rcac-help@purdue.edu.

UPDATE: February 12, 2019 10:51am EST

Brown and BrownGPU scratch has been returned to normal service. Job scheduling has been restarted, so Brown and BrownGPU are back to full production. Please let us know if you see any lingering issues at rcac-help@purdue.edu.

Storage engineers and the vendor continue to work on bringing Halstead/HalsteadGPU scratch back to service. We will provide another update on Halstead by 2 pm today.

UPDATE: February 11, 2019 4:44pm EST

Both the Halstead and Brown scratch filesystems (each shared with its respective GPU system) were damaged by a power spike during the power outage earlier today. Storage engineers and engineers from the vendor are continuing to work on them into this evening.

Job scheduling remains paused. Scratch purges are also canceled this week for the Brown and Halstead scratch filesystems.

We will provide another update by 10:00 am tomorrow morning.

ORIGINAL: February 11, 2019 8:40am EST

Halstead, HalsteadGPU, Brown, and BrownGPU went offline during a campus power event around 8:40 am this morning. Engineers are working to bring the compute nodes and the scratch system back online. Other systems are back online at this time. Job scheduling is paused at the moment.

We will provide an update by 5 pm this afternoon.

Originally posted: February 11, 2019 1:24pm EST
Last updated: February 12, 2019 4:23pm EST
