
Gilbreth Queue Changes

  • Announcements
  • Gilbreth

To support the expansion of Gilbreth and its related pricing changes, several queue changes are coming to the cluster. These changes are designed to increase the availability of GPUs on Gilbreth and reduce wait times:

  • Each lab/PI will have a named queue with at least one GPU. This is typically named after the PI or the group.
    • If you previously had access to a named queue for your lab/PI, the quantity of GPUs in that queue will remain the same.
    • If you previously had subscription access to the shared partner queue only, you will now have access to a new, named queue containing one A10 GPU.
  • The partner, long, and highmem shared queues will be removed. Instead, all researchers may choose to submit jobs to their named lab/PI queue or to the new shared queue, standby. The standby queue functions the same as standby on other community clusters, offering access to GPU nodes beyond those purchased in your group's named queue. It replaces the functionality of the retired shared queues.
  • The default output of the slist command on Gilbreth now displays available GPU counts in your queues (rather than CPU counts, as before), making it easier to monitor your queues.

To learn which queues you can submit jobs to, use the slist command in a terminal. You may need to edit your job scripts to change the queue name. The following table outlines the use cases for each of the new queues and suggests which of them you should use based on your current workflow.
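If your job scripts still reference one of the retired queues, updating them is a one-line substitution. The sketch below assumes Slurm-style `#SBATCH -A <queue>` directives and a hypothetical script name (`job.sh`); check slist and your existing scripts for the exact directive used on Gilbreth:

```shell
# Create a sample job script that still targets the retired 'partner' queue
# (hypothetical file name and contents, for illustration only)
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH -A partner
#SBATCH --gres=gpu:1
python train.py
EOF

# Point the script at a queue that still exists (here, the new shared
# 'standby' queue); your named lab/PI queue is usually the better choice
sed -i 's/^#SBATCH -A partner/#SBATCH -A standby/' job.sh

# Confirm the queue line was updated
grep '^#SBATCH -A' job.sh
```

The same substitution works for scripts that referenced the long or highmem queues; substitute your named lab/PI queue instead of standby where the table below recommends it.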

Recommended queues based on your needs:

| Old Queue | Old Limits | Use Case | New Queue | New Limits |
|---|---|---|---|---|
| "mylab" | Walltime: 2 weeks; Concurrent GPUs: amount purchased | Run jobs for up to 14 days; quick job start | "mylab" | Walltime: 2 weeks; Concurrent GPUs: amount purchased |
| partner | Walltime: 1 day; Concurrent jobs: 6 | Get shared access to a node for up to 1 day | "mylab" | Walltime: 2 weeks; Concurrent GPUs: amount purchased |
| long | Walltime: 1 week; Concurrent jobs: 2; Concurrent GPUs: 12 | Run a job for up to 7 days | "mylab" | Walltime: 2 weeks; Concurrent GPUs: amount purchased |
| highmem | Walltime: 6 hours; Concurrent jobs: 1 | Run jobs on a GPU with 32 GB memory | standby | Walltime: 4 hours; Concurrent jobs: 16; Concurrent GPUs: 16 |
| debug | Walltime: 1 hour; Concurrent jobs: 1 | Run short tests to ensure jobs initialize correctly | debug | Walltime: 30 min; Concurrent jobs: 1; Concurrent GPUs: 1 |
| training | Walltime: 1 week; Concurrent jobs: 1; Concurrent GPUs: 12 | Run large AI/ML model training on multiple GPUs for up to 7 days | training | Walltime: 4 days; Concurrent jobs: 2; Concurrent GPUs: 8 |
| (none) | (none) | Run on idle nodes outside the limits of your named queue | standby | Walltime: 4 hours; Concurrent jobs: 16; Concurrent GPUs: 16 |

Please reach out to rcac-help@purdue.edu if you have any questions about the changes to queues.

Originally posted: