Anvil Changes
During the scheduled Anvil maintenance on March 21, 2023, several changes were made to the system.
-
The default Slurm partition has been changed from wholenode to shared. This change seeks to reduce accidental waste of compute resources and SUs. In the old default partition, wholenode, jobs consume all 128 cores on a node even if a user requests just one task, i.e. jobs get charged 128 SUs per node per hour. With the current change, jobs not requesting an explicit partition will be placed in the shared partition instead, leading to fewer surprises.
Note: the shared partition does not allow multi-node jobs (see description of partitions and their limits). If your multi-node jobs used to rely on wholenode being the default partition, you may have to specify the partition explicitly now. A QOSMaxCpuPerJobLimit error during job submission would be a good indicator of this.
-
CUDA updates. The NVIDIA GPU driver has been updated, and CUDA 12.0.1 has been added as a module.
-
Operating system updates. The operating system on Anvil machines has been updated to Rocky Linux 8.7.
-
Slurm updates. Slurm has been updated to version 22.05.8.
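For multi-node jobs that previously relied on the old default, a job script that names the partition explicitly might look like the minimal sketch below. The allocation name myallocation, the node and task counts, and the time limit are placeholders chosen for illustration, not values from this announcement. The echo lines only report variables that Slurm sets inside a running job.

```shell
#!/bin/bash
# Sketch of a multi-node job script; all values below are placeholders.
#SBATCH --account=myallocation
#SBATCH --partition=wholenode
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
#SBATCH --time=00:30:00

# Slurm sets these variables inside a running job. The fallback text
# only appears when the script is run outside of Slurm.
partition="${SLURM_JOB_PARTITION:-not set}"
nodes="${SLURM_JOB_NUM_NODES:-not set}"
echo "Partition: ${partition}"
echo "Nodes: ${nodes}"
```

The same request can also be made at submission time with sbatch --partition=wholenode (or -p wholenode), since command-line options take precedence over the directives in the script.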
Please submit a ticket through the ACCESS Help Desk at https://support.access-ci.org/open-a-ticket if you have any questions.