Submitting a Job

Once you have a job submission file, you may submit this script to SLURM using the sbatch command. SLURM will find, or wait for, available resources matching your request and run your job there.

On Gautschi, in order to submit jobs, you need to specify the partition, account and Quality of Service (QoS) name to which you want to submit your jobs. To familiarize yourself with the partitions and QoS available on Gautschi, visit Gautschi Queues and Partitons. To check the available partitions on Gautschi, you can use the showpartitions , and to check your available accounts you can use slist commands. Slurm uses the term "Account" with the option -A or --account= to specify different batch accounts, the option -p or --partition= to select a specific partition for job submission, and the option -q or --qos= .

showpartitions

Partition statistics for cluster gautschi at Fri Dec  6 05:26:30 PM EST 2024
     Partition     #Nodes     #CPU_cores  Cores_pending   Job_Nodes MaxJobTime Cores Mem/Node
     Name State Total  Idle  Total   Idle Resorc  Other   Min   Max  Day-hr:mn /node     (GB)
       ai    up    20    20   2240   2240      0      0     1 infin   infinite   112    1031 
      cpu    up   336   300  64512  57600      0      0     1 infin   infinite   192     386 
  highmem    up     6     6   1152   1152      0      0     1 infin   infinite   192    1547   
 smallgpu    up     6     6    768    768      0      0     1 infin   infinite   128     386 
profiling    up     2     2    384    384      0      0     1 infin   infinite   192     386

Link to section 'CPU Partition' of 'Submitting a Job' CPU Partition

The CPU partition on Gautschi has two Quality of Service (QoS) levels: normal and standby. To submit your job to one compute node on cpu partition and 'normal' QoS which has "high priority":

$ sbatch --nodes=1 --ntasks=1 --partition=cpu --account=accountname --qos=normal myjobsubmissionfile
$ sbatch -N1 -n1 -p cpu -A accountname -q normal myjobsubmissionfile

To submit your job to one compute node on cpu partition and 'standby' QoS which is has "low priority":

$ sbatch --nodes=1 --ntasks=1 --partition=cpu --account=accountname --qos=standby myjobsubmissionfile
$ sbatch -N1 -n1 -p cpu -A accountname -q standby myjobsubmissionfile

Link to section ' AI Partition' of 'Submitting a Job' AI Partition

The CPU partition on Gautschi has two Quality of Service (QoS) levels: normal and preemptible. To submit your job to one compute node requesting one GPU on ai partition and 'normal' QoS which is has "high priority":

$ sbatch --nodes=1 --gpus-per-node=1 --ntasks=14 --partition=ai --account=accountname --qos=normal myjobsubmissionfile
$ sbatch -N1 --gpus-per-node=1 -n14 -p ai -A accountname -q normal myjobsubmissionfile

To submit your job to one compute node requesting one GPU on ai partition and 'preemptible' QoS which has "high priority":

$ sbatch --nodes=1 --gpus-per-node=1 --ntasks=14 --partition=ai --account=accountname --qos=preemptible myjobsubmissionfile
$ sbatch -N1 --gpus-per-node=1 -n14 -p ai -A accountname -q preemptible myjobsubmissionfile

Link to section 'Highmem Partition' of 'Submitting a Job' Highmem Partition

To submit your job to a compute node on the highmem partition, you don’t need to specify the QoS name because only one QoS exists for this partition, and the default is normal. However, the highmem partition is suitable for jobs with memory requirements that exceed the capacity of a standard node, so the number of requested tasks should be appropriately high.

Link to section 'Profiling Partition' of 'Submitting a Job' Profiling Partition

To submit your job to a compute node on the profiling partition, you also don’t need to specify the QoS name because only one QoS exists for this partition, and the default is normal.

$ sbatch --nodes=1 --ntasks=1 --partition=profiling --account=accountname myjobsubmissionfile
$ sbatch -N1 -n1 -p profiling -A accountname myjobsubmissionfile

Link to section 'Smallgpu Partition' of 'Submitting a Job' Smallgpu Partition

To submit your job to a compute node on the smallgpu partition, you don’t need to specify the QoS name because only one QoS exists for this partition, and the default is normal. You should request cores proportional to the number of GPUs you are using in this partition (i.e. if you only need one of the two GPUs, you should request half of the cores on the node).

$ sbatch --nodes=1 --ntasks=64 --gpus-per-node=1 --partition=smallgpu --account=accountname myjobsubmissionfile
$ sbatch -N1 -n64 --gpus-per-node=1 -p smallgpu -A accountname myjobsubmissionfile

Link to section 'General Information' of 'Submitting a Job' General Information

By default, each job receives 30 minutes of wall time, or clock time. If you know that your job will not need more than a certain amount of time to run, request less than the maximum wall time, as this may allow your job to run sooner. To request the 1 hour and 30 minutes of wall time:

$ sbatch -t 01:30:00 -N=1 -n=1 -p=cpu -A=accountname -q=standby myjobsubmissionfile

The --nodes= or -N value indicates how many compute nodes you would like for your job, and --ntasks= or -n value indicates the number of tasks you want to run.

In some cases, you may want to request multiple nodes. To utilize multiple nodes, you will need to have a program or code that is specifically programmed to use multiple nodes such as with MPI. Simply requesting more nodes will not make your work go faster. Your code must support this ability.

To request 2 compute nodes:

$ sbatch -t 01:30:00 -N=2 -n=16 -p=cpu -A=accountname -q=standby myjobsubmissionfile

By default, jobs on Gautschi will share nodes with other jobs.

If more convenient, you may also specify any command line options to sbatch from within your job submission file, using a special form of comment:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#SBATCH --account=accountname
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --partition=cpu
#SBATCH --qos=normal
#SBATCH --time=1:30:00
#SBATCH --job-name myjobname

# Print the hostname of the compute node on which this job is running.
/bin/hostname

If an option is present in both your job submission file and on the command line, the option on the command line will take precedence.

After you submit your job with SBATCH, it may wait in queue for minutes, hours, or even weeks. How long it takes for a job to start depends on the specific queue, the resources and time requested, and other jobs already waiting in that queue requested as well. It is impossible to say for sure when any given job will start. For best results, request no more resources than your job requires.

Once your job is submitted, you can monitor the job status, wait for the job to complete, and check the job output.