Halstead Scheduling Outage

May 25, 2017  10:30am – June 2, 2017  10:30am
Halstead

Nodes have continued to gradually reboot into the new image as jobs complete. At this point, more than 80% of Halstead has completed this process, and we have not seen any issues with nodes doing so. This outage is closed.

Update: May 25, 2017 5:00pm

About 25% of the nodes are back online and accepting new jobs. Most of the remaining nodes are still actively running jobs from before the issue arose, but engineers are watching to ensure that, as those jobs complete, the nodes properly update and start accepting new jobs as well.

Original Message:

On Thursday, May 25th, 2017, around 10:30am, Halstead began to slowly stop scheduling new jobs due to an issue with a routine update process. The updates failed to apply as they normally would, and this is gradually making all Halstead nodes unavailable as their jobs complete. Currently running jobs are unaffected.

Engineers are investigating the issue now and hope to correct the behavior quickly. However, there is not yet an estimate of when Halstead will be fully operational again. We will update this as we learn more.

Originally posted: May 25, 2017  3:51pm