Brown Transition to SLURM
March 3, 2020 8:00am – 4:30pm
As of 4:30pm, engineers have completed maintenance and returned the Brown cluster to normal service. All queues have been enabled and jobs have resumed scheduling. Please report any issues to firstname.lastname@example.org.
ORIGINAL: February 10, 2020 2:25pm
Brown will be transitioning to a new batch scheduler on Tuesday, March 3rd, 2020! This is a necessary upgrade and will require faculty and students to modify how they interact with the batch system. Please review the information below prior to this transition to ensure your work is not interrupted.
Brown will be switching from the PBS-based Torque/Moab scheduler to the newer SLURM scheduler. SLURM offers additional features, will reduce operating costs in the long run, and is the leading scheduler among peer institutions.
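As a rough illustration of what modifying a job script can involve, the sketch below rewrites a few common `#PBS` directives as `#SBATCH` equivalents. The helper name `translate_directive` is hypothetical, and the mapping covers only three generic options (job name, walltime, and nodes/ppn). Queue and account options are left out on purpose because they are site-specific, so treat this as a starting point and check the result on the testing cluster.

```python
import re


def translate_directive(line: str) -> str:
    """Translate one '#PBS' line to '#SBATCH' form; other lines pass through.

    Hypothetical helper covering only a few generic options. Anything it
    does not recognize is flagged for manual review rather than guessed.
    """
    stripped = line.strip()
    if not stripped.startswith("#PBS"):
        return line
    args = stripped[len("#PBS"):].strip()

    # Job name: -N name  ->  --job-name=name
    m = re.fullmatch(r"-N\s+(\S+)", args)
    if m:
        return f"#SBATCH --job-name={m.group(1)}"

    # Walltime: -l walltime=HH:MM:SS  ->  --time=HH:MM:SS
    m = re.fullmatch(r"-l\s+walltime=(\d+:\d{2}:\d{2})", args)
    if m:
        return f"#SBATCH --time={m.group(1)}"

    # Node layout: -l nodes=N:ppn=P  ->  --nodes=N and --ntasks-per-node=P
    m = re.fullmatch(r"-l\s+nodes=(\d+):ppn=(\d+)", args)
    if m:
        return (f"#SBATCH --nodes={m.group(1)}\n"
                f"#SBATCH --ntasks-per-node={m.group(2)}")

    # Unknown or site-specific directive (e.g. queues): leave a marker.
    return f"# TODO translate manually: {stripped}"


if __name__ == "__main__":
    old_script = [
        "#!/bin/bash",
        "#PBS -N myjob",
        "#PBS -l walltime=04:00:00",
        "#PBS -l nodes=2:ppn=24",
        "#PBS -q standby",
    ]
    for script_line in old_script:
        print(translate_directive(script_line))
```

The queue line is deliberately not translated: how queues are expressed under SLURM depends on local configuration, so the sketch only marks it for review.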
Don’t wait, though! You can test your scripts and try out SLURM today in a testing environment that is already available. Access this testing cluster by ThinLinc at https://desktop.mack.rcac.purdue.edu or by SSH to mack.rcac.purdue.edu. Please be aware that this environment is small. So that everyone can test their scripts and try out SLURM, do not run any serious work on the testing cluster.
We will also be hosting in-person SLURM transition training/help sessions every Friday during the transition from 2:00-3:30pm in the Envision Center. On the other four weekdays, we are available for drop-ins from 2:00-3:00pm at our Coffee Hour Consultations, held at various locations around campus.
To make this transition, the Brown cluster will be down for a longer maintenance window than normal, beginning on Tuesday, March 3rd, 2020 at 8:00am. The cluster will not return to full production until 4:30pm that day. Any PBS job that requests a walltime extending past 8:00am on Tuesday will not begin. Because of the batch scheduler change, any jobs still in the queue that have not run before the maintenance must be deleted as part of the change.
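The walltime rule above comes down to simple date arithmetic, sketched below. The function names are hypothetical. The sketch assumes walltimes are written as `HH:MM:SS` and that a job ending exactly at 8:00am is still allowed to start; the scheduler's actual boundary behavior may differ.

```python
from datetime import datetime, timedelta

# Maintenance start taken from the announcement: Tuesday, March 3, 2020, 8:00am.
MAINTENANCE_START = datetime(2020, 3, 3, 8, 0)


def parse_walltime(walltime: str) -> timedelta:
    """Convert an 'HH:MM:SS' walltime string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in walltime.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def can_start_before_maintenance(start: datetime, walltime: str) -> bool:
    """Return True if a job starting at `start` would finish by 8:00am.

    Assumption: finishing exactly at the maintenance start is acceptable.
    """
    return start + parse_walltime(walltime) <= MAINTENANCE_START


if __name__ == "__main__":
    monday_evening = datetime(2020, 3, 2, 20, 0)
    # A 4-hour request ends at midnight, well before the maintenance.
    print(can_start_before_maintenance(monday_evening, "04:00:00"))
    # A 24-hour request would run into the maintenance and will not begin.
    print(can_start_before_maintenance(monday_evening, "24:00:00"))
```

If a job is held back for this reason, requesting a shorter walltime is one way to let it run before the maintenance begins.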