Whole-Floor Cluster Maintenance

UPDATE: May 13, 2021  2:17pm

As of 2:17pm, the remaining issues with the Scholar cluster have been resolved, and Scholar has been returned to normal service. Job queues have been enabled and job scheduling has been resumed. In addition, the engineers were able to complete everything that had been planned for tomorrow's Scholar cluster maintenance, so that maintenance is now cancelled.

This concludes the whole-floor downtime due to Data Depot migration work. Thank you for your patience. Please report any issues to rcac-help@purdue.edu.

UPDATE: May 12, 2021  9:26pm

As of 9:26pm, the Bell, Brown, Gilbreth, Halstead, Hammer, WCERES, Workbench, and WSC Hadoop clusters have been returned to normal service following completion of Data Depot migration work. Job queues have been enabled and job scheduling has been resumed.

The Scholar cluster continues to be unavailable due to a problem with its Slurm scheduler database. Engineers continue working on Scholar, and we will provide an update by 5pm tomorrow.

Thank you for your patience, and please report any issues to rcac-help@purdue.edu.

ORIGINAL: May 11, 2021 5:00pm - May 12, 2021 11:00pm EDT

The majority of Research Computing computational resources (Bell, Gilbreth, Scholar, Brown, Workbench, Halstead, Hammer, WCERES, and WSC Hadoop clusters) will be unavailable May 11, 2021 5:00pm - May 12, 2021 11:00pm EDT for Data Depot migration work. The clusters will return to full production by Wednesday, May 12th, 2021 at 11:00pm.

Any Slurm job whose requested walltime would carry it past Tuesday, May 11th, 2021 at 5:00pm will not start and will remain in the queue until after the maintenance is completed.
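
To check whether a job could still start before the maintenance, the walltime rule above can be sketched as simple date arithmetic. This is only an illustration: the function names are hypothetical, times are treated as local Eastern time, and the scheduler's actual decision also depends on queue state and site configuration.

```python
from datetime import datetime, timedelta

# Start of the maintenance window, taken from this notice (local Eastern time).
MAINTENANCE_START = datetime(2021, 5, 11, 17, 0)

def max_walltime(now: datetime) -> timedelta:
    """Longest walltime that still ends by the start of the maintenance."""
    return max(MAINTENANCE_START - now, timedelta(0))

def can_start(now: datetime, walltime: timedelta) -> bool:
    """True if a job started at `now` would finish by the maintenance start."""
    return now + walltime <= MAINTENANCE_START

if __name__ == "__main__":
    now = datetime(2021, 5, 11, 9, 0)           # example submission time
    print(max_walltime(now))                    # time left before the window
    print(can_start(now, timedelta(hours=4)))   # short job fits
    print(can_start(now, timedelta(hours=12)))  # long job would wait in the queue
```

A job that would not fit can either be left in the queue until the clusters return, or be resubmitted with a shorter walltime request if the work can finish in the time remaining.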

Originally posted: April 20, 2021 10:49am EDT