Halstead Scheduling Outage
May 25, 2017 10:30am – June 2, 2017 10:30am
Nodes have continued to gradually reboot into the new image as jobs complete. At this point, more than 80% of Halstead's nodes have completed this process, and we have seen no issues as they do so. This outage is closed.
Update: May 25, 2017 5:00pm
About 25% of the nodes are back online and accepting new jobs. Most of the remaining nodes are still actively running jobs from before the issue arose, but engineers are monitoring to ensure that, as those jobs complete, the nodes update properly and begin accepting new jobs as well.
On Thursday, May 25, 2017, at around 10:30am, Halstead began to gradually stop scheduling new jobs due to an issue with a routine update process. The updates failed to apply as they normally would and are gradually making all Halstead nodes unavailable as their jobs complete. Currently running jobs are unaffected.
Engineers are investigating the issue now and hope to correct the behavior quickly. However, there is not yet an estimate of when Halstead will be fully operational again. We will update this notice as we learn more.