During the scheduled Anvil maintenance on March 21, 2023, several changes were made to the system.
The default Slurm partition has been changed from wholenode to shared. This change seeks to reduce accidental waste of compute resources and SUs. In the old default partition, wholenode, jobs consume all 128 cores on a node even if a user requests just one task, i.e., jobs get charged 128 SUs per node per hour. With the current change, jobs that do not request an explicit partition will be placed in the shared partition instead, leading to fewer surprises.
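To make the difference in charges concrete, here is a minimal arithmetic sketch. The 128 SUs per node per hour figure for wholenode comes from the text above. The assumption that a shared job is charged 1 SU per requested core per hour is ours; check the partition documentation for the exact charging rules. The function names are hypothetical.

```shell
#!/bin/bash
# Cores per Anvil node, as stated in the announcement.
CORES_PER_NODE=128

# wholenode: every core on each allocated node is charged,
# regardless of how many tasks the job actually runs.
wholenode_sus() {
  local nodes=$1 hours=$2
  echo $(( nodes * CORES_PER_NODE * hours ))
}

# shared: ASSUMPTION - only the requested cores are charged,
# at 1 SU per core-hour.
shared_sus() {
  local cores=$1 hours=$2
  echo $(( cores * hours ))
}

echo "wholenode, 1 node, 1 hour: $(wholenode_sus 1 1) SUs"
echo "shared, 1 core, 1 hour: $(shared_sus 1 1) SUs"
```

Under these assumptions, a one-task, one-hour job costs 128 SUs in wholenode and 1 SU in shared.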
The shared partition does not allow multi-node jobs (see the description of partitions and their limits). If your multi-node jobs relied on wholenode being the default partition, you may now have to specify the partition explicitly. A QOSMaxCpuPerJobLimit error at job submission is a good indicator of this.
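As an illustration, a multi-node job script that names its partition explicitly might look like the sketch below. The account name, node count, time limit, and application command are hypothetical placeholders; substitute your own values.

```shell
#!/bin/bash
#SBATCH --partition=wholenode     # request the partition explicitly; the default is now shared
#SBATCH --nodes=2                 # multi-node jobs are not allowed in shared
#SBATCH --ntasks-per-node=128     # one task per core on each node
#SBATCH --time=00:30:00           # hypothetical wall-time limit
#SBATCH --account=myallocation    # hypothetical allocation name

# Slurm sets these variables inside a running job; fall back to
# "unknown" when the script is run outside of Slurm.
partition="${SLURM_JOB_PARTITION:-unknown}"
nodes="${SLURM_JOB_NUM_NODES:-unknown}"
echo "Partition: ${partition}, nodes: ${nodes}"

# Launch the application (hypothetical binary; replace with your own command).
# srun ./my_mpi_app
```

The partition can also be given on the command line without editing the script, for example `sbatch -p wholenode job.sh`, where `job.sh` is a placeholder file name.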
CUDA updates. The NVIDIA GPU driver has been updated, and CUDA 12.0.1 has been added as a module.
Operating system updates. The operating system on Anvil machines has been updated to Rocky Linux 8.7.
Slurm updates. Slurm has been updated to version 22.05.8.
If you have any questions, please submit a ticket through the ACCESS Help Desk at https://support.access-ci.org/open-a-ticket.