Brown Transition to SLURM

March 3, 2020  8:00am – 4:30pm
Brown

UPDATE: March 3, 2020  4:30pm

As of 4:30pm, engineers have completed maintenance and returned the Brown cluster to normal service. All queues have been enabled and jobs have resumed scheduling. Please report any issues to rcac-help@purdue.edu.


ORIGINAL: February 10, 2020  2:25pm

Brown will be transitioning to a new batch scheduler on Tuesday, March 3rd, 2020! This is a necessary upgrade and will require faculty and students to modify how they interact with the batch system. Please review the information below prior to this transition to ensure your work is not interrupted.

Brown will be switching from the PBS-based Torque/Moab scheduler to the newer SLURM scheduler. SLURM offers additional features, will reduce operating costs in the long run, and is the leading scheduler among peer institutions.

The Brown User Guide has been updated with information about SLURM. The differences between PBS and SLURM, and instructions for converting from one to the other, are also covered in our SLURM Quick Reference Guide.

Don’t wait, though! You can test your scripts and try out SLURM today. A testing environment is already available for you to log in to and explore SLURM and the impact it will have on your work. Access this testing cluster by ThinLinc at https://desktop.mack.rcac.purdue.edu or by SSH to mack.rcac.purdue.edu. Please be aware that this environment is small. So that everyone can test their scripts and try out SLURM, do not run any serious work on the testing cluster.

We will also be hosting in-person SLURM transition training and help sessions every Friday during the transition, from 2:00-3:30pm in the Envision Center. On the other four days of the week, we are available for drop-ins from 2:00-3:00pm at our Coffee Hour Consultations, held at various locations around campus.

To make the transition, the Brown cluster will be down for a longer maintenance period than normal, beginning on Tuesday, March 3rd, 2020 at 8:00am. The cluster will not return to full production until 4:30pm that day. Any PBS job whose requested walltime would take it past the start of maintenance on Tuesday morning will not begin. Because of the batch scheduler change, any jobs still in the queue that have not run before the maintenance must be deleted.
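
To illustrate the walltime rule above, here is a minimal Python sketch that checks whether a job's requested walltime would fit before the maintenance begins. It is not the scheduler's actual logic: the function names are hypothetical, the walltime is assumed to be given as HH:MM:SS, and a job ending exactly at 8:00am is assumed to be allowed.

```python
from datetime import datetime, timedelta

# Start of the maintenance window, taken from this announcement.
MAINTENANCE_START = datetime(2020, 3, 3, 8, 0)


def parse_walltime(walltime: str) -> timedelta:
    """Convert an HH:MM:SS walltime string into a timedelta (assumed format)."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def fits_before_maintenance(start: datetime, walltime: str) -> bool:
    """Return True if a job starting at `start` would end by the maintenance start.

    Hypothetical helper for illustration only; the real scheduler makes
    this decision itself.
    """
    return start + parse_walltime(walltime) <= MAINTENANCE_START


if __name__ == "__main__":
    monday_evening = datetime(2020, 3, 2, 20, 0)
    # A 4-hour job started Monday at 8:00pm ends at midnight, before maintenance.
    print(fits_before_maintenance(monday_evening, "04:00:00"))
    # A 13-hour job started at the same time would run past 8:00am Tuesday.
    print(fits_before_maintenance(monday_evening, "13:00:00"))
```

In practice, this means that requesting a shorter walltime in the days before the maintenance may allow a job to start that would otherwise remain in the queue.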
