GPU
The Gilbreth cluster nodes contain NVIDIA GPUs that support CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Gilbreth.
This section illustrates how to use SLURM to submit a simple GPU program.
Suppose that you named your executable file gpu_hello, compiled from the sample code gpu_hello.cu (see the section on compiling NVIDIA GPU codes).
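For reference, here is a minimal sketch of what such a program might look like. This is an illustration only: the command-line device-index argument is an assumption based on the ./gpu_hello 0 invocation used below, and the actual sample code is given in the compiling section.
// gpu_hello.cu -- hypothetical minimal sketch; the actual sample may differ
#include <cstdio>
#include <cstdlib>

__global__ void hello()
{
    // Print from the GPU
    printf("hello, world\n");
}

int main(int argc, char *argv[])
{
    // Select the GPU whose index is given on the command line (default 0)
    int device = (argc > 1) ? atoi(argv[1]) : 0;
    cudaSetDevice(device);

    // Launch a single device thread and wait for it to finish
    hello<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}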
Prepare a job submission file with an appropriate name, here named gpu_hello.sub:
#!/bin/bash
# FILENAME: gpu_hello.sub

module load cuda

# Short hostname of the node the job landed on (useful for logging)
host=$(hostname -s)

# Slurm sets CUDA_VISIBLE_DEVICES to the GPU(s) allocated to this job
echo $CUDA_VISIBLE_DEVICES

# Run on the first available GPU
./gpu_hello 0
Submit the job:
sbatch -A ai --nodes=1 --gres=gpu:1 -t 00:01:00 gpu_hello.sub
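While the job is waiting in the queue or running, you can check its status with the standard Slurm squeue command (replace myusername with your own username):
squeue -u myusername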
Requesting a GPU from the scheduler is required.
You can specify the total number of GPUs, the number of GPUs per node, or even the number of GPUs per task:
sbatch -A ai --nodes=1 --gres=gpu:1 -t 00:01:00 gpu_hello.sub
sbatch -A ai --nodes=1 --gpus-per-node=1 -t 00:01:00 gpu_hello.sub
sbatch -A ai --nodes=1 --gpus-per-task=1 -t 00:01:00 gpu_hello.sub
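Alternatively, these resource requests can be embedded directly in the submission file as #SBATCH directives, so the job can be submitted with a plain sbatch gpu_hello.sub. A minimal sketch, reusing the account name ai and the one-minute walltime from the examples above:
#!/bin/bash
# FILENAME: gpu_hello.sub
#SBATCH -A ai            # account/allocation to charge
#SBATCH --nodes=1        # one node
#SBATCH --gres=gpu:1     # one GPU
#SBATCH -t 00:01:00      # one minute of walltime

module load cuda

echo $CUDA_VISIBLE_DEVICES
./gpu_hello 0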
After job completion, view the new output file in your directory:
ls -l
gpu_hello
gpu_hello.cu
gpu_hello.sub
slurm-myjobid.out
View the results in the standard output file, slurm-myjobid.out:
0
hello, world
If the job failed to run, view the error messages in the same file, slurm-myjobid.out.
To use multiple GPUs in your job, simply specify a larger value for the GPU count parameter. However, be aware of how many GPUs are installed on the node(s) you request: the scheduler cannot allocate more GPUs than physically exist. See the detailed hardware overview and the output of the sfeatures command for the specifics of the GPUs in Gilbreth.
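For example, to request two GPUs on a single node (assuming the node type you land on has at least two GPUs installed):
sbatch -A ai --nodes=1 --gres=gpu:2 -t 00:01:00 gpu_hello.sub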