Whole-Floor Cluster Maintenance

May 11, 2021  5:00pm – May 12, 2021  11:00pm
Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, WCERES, Workbench, WSC Hadoop

UPDATE: May 13, 2021  2:17pm

As of 2:17pm, the remaining issues with the Scholar cluster have been resolved, and Scholar has been returned to normal service. Job queues have been enabled and job scheduling has resumed. In addition, the engineers were able to complete everything that had been planned for tomorrow's Scholar cluster maintenance, so that maintenance is now cancelled.

This concludes the whole-floor downtime due to Data Depot migration work. Thank you for your patience. Please report any issues to rcac-help@purdue.edu.


UPDATE: May 12, 2021  9:26pm

As of 9:26pm, the Bell, Brown, Gilbreth, Halstead, Hammer, WCERES, Workbench, and WSC Hadoop clusters have been returned to normal service following completion of Data Depot migration work. Job queues have been enabled and job scheduling has been resumed.

The Scholar cluster remains unavailable due to a problem with its Slurm scheduler database. Engineers continue working on Scholar, and we will provide an update by 5pm tomorrow.

Thank you for your patience, and please report any issues to rcac-help@purdue.edu.


ORIGINAL: April 20, 2021  10:49am

The majority of Research Computing computational resources (Bell, Brown, Gilbreth, Halstead, Hammer, Scholar, WCERES, Workbench, and WSC Hadoop clusters) will be unavailable from May 11, 2021 at 5:00pm until May 12, 2021 at 11:00pm for Data Depot migration work. The clusters will return to full production by Wednesday, May 12th, 2021 at 11:00pm.

Any Slurm jobs whose requested walltime would take them past Tuesday, May 11th, 2021 at 5:00pm will not start and will remain in the queue until the maintenance is completed.
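
As a rough illustration of the walltime rule above, the following Python sketch checks whether a job could finish before the maintenance window opens. It is only date arithmetic based on the times in this announcement, not the scheduler's actual logic; the function names `fits_before_maintenance` and `max_walltime` are hypothetical and chosen for this example, and times are assumed to be local cluster time.

```python
from datetime import datetime, timedelta

# Maintenance start from the announcement: Tuesday, May 11, 2021 at 5:00pm.
# Assumption: naive datetimes stand in for local cluster time.
MAINTENANCE_START = datetime(2021, 5, 11, 17, 0)


def fits_before_maintenance(start_time, walltime):
    """Return True if a job starting at start_time with the requested
    walltime would end no later than the maintenance start."""
    return start_time + walltime <= MAINTENANCE_START


def max_walltime(start_time):
    """Return the longest walltime that still ends by the maintenance
    start, or zero if the window has already begun."""
    remaining = MAINTENANCE_START - start_time
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    start = datetime(2021, 5, 11, 9, 0)
    # A 4-hour request ends at 1:00pm, before the window opens.
    print(fits_before_maintenance(start, timedelta(hours=4)))
    # A 12-hour request would run past 5:00pm, so it would stay queued.
    print(fits_before_maintenance(start, timedelta(hours=12)))
    print(max_walltime(start))
```

Under these assumptions, a job that needs to run before the maintenance should request a walltime no longer than the value returned by `max_walltime` for its expected start time.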
