[POSTPONED] Gilbreth Queue Changes

June 29, 2022 8:00am EDT
Announcements
Gilbreth

UPDATE: July 8, 2022 2:12pm EDT

Due to delays with hardware delivery caused by global supply chain shortages, this planned change will not be carried out during the July 20 maintenance.

This conversion will be performed at a later date, to be determined once the hardware is delivered. Please continue using your current queues after the maintenance on July 20.

ORIGINAL: June 29, 2022 8:00am - August 31, 2022 5:00pm EDT

To support the expansion of Gilbreth and related pricing changes, there will be several changes to queues on Gilbreth. These changes are designed to increase the availability of GPUs on Gilbreth and reduce wait time:

Each lab/PI will have a named queue with at least one GPU. The queue is typically named after the PI or the group.
- If you previously had access to a named queue for your lab/PI, the name of the queue and the quantity of GPUs will remain the same.
- If you previously had subscription access to the shared partner queue only, you will now have access to a new, named queue with one A10 or A30 GPU.

The partner, long, and highmem shared queues will be removed. Instead, all researchers may submit jobs either to their named lab/PI queue or to the new shared queue, standby. The standby queue will function the same as standby on other community clusters and will offer access to GPU nodes beyond those in the group's purchased named queue. It replaces the functionality of the shared queues being retired.

To learn which queues you can submit jobs to, use the slist command in a terminal. You may need to edit your job scripts to change the queue name.
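
As an illustration of the job script edit mentioned above, here is a minimal Python sketch that swaps one queue name for another. It assumes the queue is selected with a Slurm-style `#SBATCH -A <queue>` or `#SBATCH --account=<queue>` directive, which this announcement does not state; check your own scripts for the directive they actually use. The function name `replace_queue` is hypothetical, and `mylab` stands in for your lab's named queue.

```python
import re


def replace_queue(script_text: str, old_queue: str, new_queue: str) -> str:
    """Swap old_queue for new_queue in queue directives of a job script.

    Assumption (not from the announcement): the queue is chosen with
    '#SBATCH -A <queue>' or '#SBATCH --account=<queue>'.
    """
    pattern = re.compile(
        # Directive prefix, captured so it can be written back unchanged.
        r"^(#SBATCH[ \t]+(?:-A[ \t]+|--account(?:=|[ \t]+)))"
        # The exact old queue name, followed by whitespace or end of line,
        # so that 'partner' does not match 'partner2'.
        + re.escape(old_queue)
        + r"(?=\s|$)",
        re.MULTILINE,
    )
    # A function replacement avoids backslash handling in new_queue.
    return pattern.sub(lambda m: m.group(1) + new_queue, script_text)


if __name__ == "__main__":
    script = "#!/bin/bash\n#SBATCH -A partner\n#SBATCH --time=04:00:00\n"
    print(replace_queue(script, "partner", "mylab"))
```

Only directive lines are touched, so a queue name that appears elsewhere in the script (for example in an `echo` statement) is left alone.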

The following table shows which new queue best fits each current use case:
Old Queue	Old Limits	Use Case	New Queue	New Limits
`"mylab"`	Walltime: 2 Weeks; Concurrent GPUs: Amount Purchased	Run jobs for up to 14 days; quick job start	`"mylab"`	Walltime: 2 Weeks; Concurrent GPUs: Amount Purchased
`partner`	Walltime: 1 Day; Concurrent Jobs: 6	Get shared access to a node for up to 1 day	`"mylab"`	Walltime: 2 Weeks; Concurrent GPUs: Amount Purchased
`long`	Walltime: 1 Week; Concurrent Jobs: 2; Concurrent GPUs: 12	Run a job for up to 7 days	`"mylab"`	Walltime: 2 Weeks; Concurrent GPUs: Amount Purchased
`highmem`	Walltime: 6 Hours; Concurrent Jobs: 1	Run jobs on a GPU with 32GB memory	`standby`	Walltime: 4 Hours; Concurrent Jobs: 16; Concurrent GPUs: 16
`training`	Walltime: 1 Week; Concurrent Jobs: 1; Concurrent GPUs: 12	Run large AI/ML model training on multiple GPUs for up to 7 days	`training`	Walltime: 4 Days; Concurrent Jobs: 2; Concurrent GPUs: 8
		Run on idle nodes beyond your named queue	`standby`	Walltime: 4 Hours; Concurrent Jobs: 16; Concurrent GPUs: 16
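
The mapping in the table can also be written as a small lookup, which may help when checking many job scripts at once. The following Python sketch copies the queue names and new walltime limits from the table; the names `MIGRATION` and `suggest_queue` are hypothetical, and `mylab` is the table's placeholder for your lab's named queue. It is an illustration under those assumptions, not an official tool.

```python
from datetime import timedelta

# Old queue -> (suggested new queue, new walltime limit), copied from the table.
# "mylab" is the table's placeholder for a lab's named queue.
MIGRATION = {
    "mylab": ("mylab", timedelta(weeks=2)),
    "partner": ("mylab", timedelta(weeks=2)),
    "long": ("mylab", timedelta(weeks=2)),
    "highmem": ("standby", timedelta(hours=4)),
    "training": ("training", timedelta(days=4)),
}


def suggest_queue(old_queue: str, walltime: timedelta) -> tuple[str, bool]:
    """Return (new queue, whether the requested walltime fits its new limit)."""
    new_queue, limit = MIGRATION[old_queue]
    return new_queue, walltime <= limit


if __name__ == "__main__":
    # A 7-day job from the old 'long' queue fits the 2-week named-queue limit.
    print(suggest_queue("long", timedelta(days=7)))
    # A 6-hour 'highmem' job exceeds the 4-hour standby limit.
    print(suggest_queue("highmem", timedelta(hours=6)))
```

Note that two rows tighten the walltime: jobs that used the full 6 hours on highmem or the full week on training would exceed the new 4-hour and 4-day limits.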

Please reach out to rcac-help@purdue.edu if you have any questions about the changes to queues.

Originally posted: June 28, 2022 4:36pm EDT