RCAC Whole-Floor Downtime and Power Work

August 2, 2021 12:24pm EDT UPDATE:

With the POD data center issue resolved, the Weber cluster was returned to normal service as of 12:24pm EDT. All queues have been enabled and jobs have resumed scheduling. Please report any issues to rcac-help@purdue.edu

This concludes the whole-floor downtime and maintenance. Thank you for your patience!

August 2, 2021 11:59am EDT UPDATE:

The Weber networking problem has been resolved and the cluster is ready to be returned to service. The return-to-service (RTS) process is currently pending resolution of a sudden cooling issue in the POD data center.

We will provide another update by 6pm, or sooner if the cooling problem is resolved before then.

August 1, 2021 5:57pm EDT UPDATE:

Engineers continue troubleshooting the Weber cluster networking issue that is preventing it from returning to service. We will provide an update by noon tomorrow, August 2nd.

August 1, 2021 2:35pm EDT UPDATE:

As of 2:35pm EDT, the Geddes cluster has been returned to normal service. Please report any issues to rcac-help@purdue.edu

Work continues on bringing the Weber cluster back. We appreciate your patience and will provide an update by 6pm tonight.

August 1, 2021 2:10pm EDT UPDATE:

As of 2:10pm EDT, the Halstead cluster has been returned to normal service. All queues have been enabled and jobs have resumed scheduling. Please report any issues to rcac-help@purdue.edu

Work continues on bringing the Geddes and Weber clusters back. We appreciate your patience and will provide an update by 6pm tonight.

August 1, 2021 11:55am EDT UPDATE:

As of 11:55am EDT, the required data center power work has been completed successfully.

The Bell, Brown, CMS, Hammer, Gilbreth, Scholar and Workbench clusters have been returned to normal service. All queues have been enabled and jobs have resumed scheduling. Please report any issues to rcac-help@purdue.edu

Work continues on bringing the Halstead, Geddes and Weber clusters back. We appreciate your patience and will provide an update by 6pm tonight.

July 28, 2021 4:09pm EDT UPDATE:

This is a reminder of the maintenance downtime for most Research Computing resources, starting this coming Friday, July 30, 2021. Please note in the attached schedule that some clusters (Brown, Hammer, and Weber) will go down on Friday, while the others will not go down until Saturday.

The Data Depot will remain available for non-cluster access.

ORIGINAL:

The majority of the Research Computing computational resources will be unavailable from July 30, 2021 7:00am to August 1, 2021 12:00pm EDT for a whole-floor downtime due to electrical power work in the MATH and POD data centers. Along with required preventative maintenance, the work will provide the power equipment upgrades necessary to house the upcoming NSF-funded Anvil supercomputer.

Due to the nature and extent of the work, some resources will be affected longer than others. The following table provides tentative maintenance start and end times for the various Research Computing systems:
Start time                           | End time                                 | Resources
Friday, July 30th, 2021 at 7:00am    | Sunday, August 1st, 2021 at 12:00pm EDT  | Brown, Hammer, Weber
Saturday, July 31st, 2021 at 7:00am  | Sunday, August 1st, 2021 at 12:00pm EDT  | Bell, CMS, Halstead, Geddes, Gilbreth, Scholar, Workbench
Not affected                         |                                          | Data Depot, WSC, WCERES, customer VMs and servers

All systems will return to full production by Sunday, August 1st, 2021 at 12:00pm EDT.

Any SLURM job whose requested walltime would carry it past the maintenance start time for its cluster will not start and will remain in the queue until after the maintenance is completed.
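
To estimate whether a job can still start before the downtime, the walltime rule can be sketched as a simple time comparison. This is only an illustration under stated assumptions, not the scheduler's actual logic: the function names and the `MAINTENANCE_START` table are hypothetical, the start times are copied from the schedule above, and all times are treated as local Eastern time without time zone handling.

```python
from datetime import datetime, timedelta

# Tentative maintenance start times from the schedule above (local Eastern time).
# The dictionary and its lowercase keys are illustrative, not an RCAC interface.
FRIDAY_START = datetime(2021, 7, 30, 7, 0)
SATURDAY_START = datetime(2021, 7, 31, 7, 0)

MAINTENANCE_START = {
    "brown": FRIDAY_START,
    "hammer": FRIDAY_START,
    "weber": FRIDAY_START,
    "bell": SATURDAY_START,
    "cms": SATURDAY_START,
    "halstead": SATURDAY_START,
    "geddes": SATURDAY_START,
    "gilbreth": SATURDAY_START,
    "scholar": SATURDAY_START,
    "workbench": SATURDAY_START,
}


def fits_before_maintenance(cluster: str, start_time: datetime, walltime: timedelta) -> bool:
    """Return True if a job starting at start_time would end by the maintenance start."""
    return start_time + walltime <= MAINTENANCE_START[cluster]


def max_walltime(cluster: str, start_time: datetime) -> timedelta:
    """Longest walltime that still ends by the maintenance start (zero if none is left)."""
    remaining = MAINTENANCE_START[cluster] - start_time
    return max(remaining, timedelta(0))


if __name__ == "__main__":
    now = datetime(2021, 7, 29, 19, 0)  # example: Thursday evening before the downtime
    print(fits_before_maintenance("brown", now, timedelta(hours=8)))
    print(max_walltime("bell", now))
```

A job that does not fit can either be resubmitted with a shorter walltime request or left in the queue to run after the maintenance is completed.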
