Example Jobs
A number of example jobs are available for you to look over and adapt to your own needs. The first few are generic examples, and latter ones go into specifics for particular software packages.
Generic SLURM Jobs
The following examples demonstrate the basics of SLURM jobs, and are designed to cover common job request scenarios. These example jobs will need to be modified to run your application or code.
Simple Job
Every SLURM job consists of a job submission file. A job submission file contains a list of commands that run your program and a set of resource (nodes, walltime, queue) requests. The resource requests can appear in the job submission file or can be specified at submit-time as shown below.
This simple example submits the job submission file hello.sub to the cpu queue on Gautschi and requests a single node:
#!/bin/bash
# FILENAME: hello.sub
# Show this ran on a compute node by running the hostname command.
hostname
echo "Hello World"
sbatch -A myallocation -p queue-name --nodes=1 --ntasks=1 --cpus-per-task=1 --time=00:01:00 hello.sub
Submitted batch job 3521
For a real job you would replace echo "Hello World"
with a command, or sequence of commands, that run your program.
After your job finishes running, the ls command will show a new file in your directory, the .out file:
ls -l
hello.sub
slurm-3521.out
The file slurm-3521.out contains the output and errors your program would have written to the screen if you had typed its commands at a command prompt:
cat slurm-3521.out
gautschi-a001.rcac.purdue.edu
Hello World
You should see the hostname of the compute node your job was executed on. Following should be the "Hello World" statement.
Multiple Node
In some cases, you may want to request multiple nodes. To utilize multiple nodes, you will need to have a program or code that is specifically programmed to use multiple nodes such as with MPI. Simply requesting more nodes will not make your work go faster. Your code must support this ability.
This example shows a request for multiple compute nodes. The job submission file contains a single command to show the names of the compute nodes allocated:
# FILENAME: myjobsubmissionfile.sub
#!/bin/bash
echo "$SLURM_JOB_NODELIST"
sbatch --nodes=2 --ntasks=384 --time=00:10:00 -A myallocation -p queue-name myjobsubmissionfile.sub
Compute nodes allocated:
gautschi-a[014-015]
The above example will allocate the total of 384 CPU cores across 2 nodes. Note that if your multi-node job requests fewer than each node's full 192 cores per node, by default Slurm provides no guarantee with respect to how this total is distributed between assigned nodes (i.e. the cores may not necessarily be split evenly). If you need specific arrangements of your tasks and cores, you can use --cpus-per-task= and/or --ntasks-per-node= flags. See Slurm documentation or man sbatch for more options.
Directives
So far these examples have shown submitting jobs with the resource requests on the sbatch
command line such as:
sbatch -A myallocation -p queue-name --nodes=1 --time=00:01:00 hello.sub
The resource requests can also be put into job submission file itself. Documenting the resource requests in the job submission is desirable because the job can be easily reproduced later. Details left in your command history are quickly lost. Arguments are specified with the #SBATCH
syntax:
#!/bin/bash # FILENAME: hello.sub
#SBATCH -A myallocation -p queue-name #SBATCH --nodes=1 --time=00:01:00 # Show this ran on a compute node by running the hostname command. hostname echo "Hello World"
The #SBATCH
directives must appear at the top of your submission file. SLURM will stop parsing directives as soon as it encounters a line that does not start with '#'. If you insert a directive in the middle of your script, it will be ignored.
This job can be then submitted with:
sbatch hello.sub
Interactive Jobs
Interactive jobs are run on compute nodes, while giving you a shell to interact with. They give you the ability to type commands or use a graphical interface in the same way as if you were on a front-end login host.
To submit an interactive job, use sinteractive to run a login shell on allocated resources.
sinteractive accepts most of the same resource requests as sbatch, so to request a login shell on the cpu account while allocating 2 nodes and 192 total cores, you might do:
sinteractive -A myallocation -p cpu -N2 -n384
To quit your interactive job:
exit
or Ctrl-D
The above example will allocate the total of 384 CPU cores across 2 nodes. Note that if your multi-node job requests fewer than each node's full 192 cores per node, by default Slurm provides no guarantee with respect to how this total is distributed between assigned nodes (i.e. the cores may not necessarily be split evenly). If you need specific arrangements of your tasks and cores, you can use --cpus-per-task= and/or --ntasks-per-node= flags. See Slurm documentation or man salloc for more options.
Serial Jobs
This shows how to submit one of the serial programs compiled in the section Compiling Serial Programs.
Create a job submission file:
#!/bin/bash
# FILENAME: serial_hello.sub
./serial_hello
Submit the job:
sbatch --nodes=1 --ntasks=1 --time=00:01:00 serial_hello.sub
After the job completes, view results in the output file:
cat slurm-myjobid.out
Runhost:gautschi-a009.rcac.purdue.edu
hello, world
If the job failed to run, then view error messages in the file slurm-myjobid.out.
Link to section 'Collecting System Resource Utilization Data' of 'Monitoring Resources' Collecting System Resource Utilization Data
Knowing the precise resource utilization an application had during a job, such as CPU load or memory, can be incredibly useful. This is especially the case when the application isn't performing as expected.
One approach is to run a program like htop
during an interactive job and keep an eye on system resources. You can get precise time-series data from nodes associated with your job using XDmod as well, online. But these methods don't gather telemetry in an automated fashion, nor do they give you control over the resolution or format of the data.
As a matter of course, a robust implementation of some HPC workload would include resource utilization data as a diagnostic tool in the event of some failure.
The monitor
utility is a simple command line system resource monitoring tool for gathering such telemetry and is available as a module.
module load monitor
Complete documentation is available online at resource-monitor.readthedocs.io. A full manual page is also available for reference, man monitor
.
In the context of a SLURM job you will need to put this monitoring task in the background to allow the rest of your job script to proceed. Be sure to interrupt these tasks at the end of your job.
#!/bin/bash
# FILENAME: monitored_job.sh
module load monitor
# track per-code CPU load
monitor cpu percent --all-cores >cpu-percent.log &
CPU_PID=$!
# track memory usage
monitor cpu memory >cpu-memory.log &
MEM_PID=$!
# your code here
# shut down the resource monitors
kill -s INT $CPU_PID $MEM_PID
A particularly elegant solution would be to include such tools in your prologue script and have the tear down in your epilogue script.
For large distributed jobs spread across multiple nodes, mpiexec
can be used to gather telemetry from all nodes in the job. The hostname is included in each line of output so that data can be grouped as such. A concise way of constructing the needed list of hostnames in SLURM is to simply use srun hostname | sort -u
.
#!/bin/bash
# FILENAME: monitored_job.sh
module load monitor
# track all CPUs (one monitor per host)
mpiexec -machinefile <(srun hostname | sort -u) \
monitor cpu percent --all-cores >cpu-percent.log &
CPU_PID=$!
# track memory on all hosts (one monitor per host)
mpiexec -machinefile <(srun hostname | sort -u) \
monitor cpu memory >cpu-memory.log &
MEM_PID=$!
# your code here
# shut down the resource monitors
kill -s INT $CPU_PID $MEM_PID
To get resource data in a more readily computable format, the monitor
program can be told to output in CSV format with the --csv
flag.
monitor cpu memory --csv >cpu-memory.csv
For a distributed job you will need to suppress the header lines otherwise one will be created by each host.
monitor cpu memory --csv | head -1 >cpu-memory.csv
mpiexec -machinefile <(srun hostname | sort -u) \
monitor cpu memory --csv --no-header >>cpu-memory.csv
Specific Applications
The following examples demonstrate job submission files for some common real-world applications. See the Generic SLURM Examples section for more examples on job submissions that can be adapted for use.
Python
Notice: Python 2.7 has reached end-of-life on Jan 1, 2020 (announcement). Please update your codes and your job scripts to use Python 3.
Python is a high-level, general-purpose, interpreted, dynamic programming language. We suggest using Anaconda which is a Python distribution made for large-scale data processing, predictive analytics, and scientific computing. For example, to use the default Anaconda distribution:
$ module load conda
For a full list of available Anaconda and Python modules enter:
$ module spider conda
Example Python Jobs
This section illustrates how to submit a small Python job to a SLURM queue.
Link to section 'Example 1: Hello world' of 'Example Python Jobs' Example 1: Hello world
Prepare a Python input file with an appropriate filename, here named hello.py
:
# FILENAME: hello.py
import string, sys
print("Hello, world!")
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/bash
# FILENAME: myjob.sub
module load conda
python hello.py
Hello, world!
Link to section 'Example 2: Matrix multiply' of 'Example Python Jobs' Example 2: Matrix multiply
Save the following script as matrix.py:
# Matrix multiplication program
x = [[3,1,4],[1,5,9],[2,6,5]]
y = [[3,5,8,9],[7,9,3,2],[3,8,4,6]]
result = [[sum(a*b for a,b in zip(x_row,y_col)) for y_col in zip(*y)] for x_row in x]
for r in result:
print(r)
Change the last line in the job submission file above to read:
python matrix.py
The standard output file from this job will result in the following matrix:
[28, 56, 43, 53]
[65, 122, 59, 73]
[63, 104, 54, 60]
Link to section 'Example 3: Sine wave plot using numpy and matplotlib packages' of 'Example Python Jobs' Example 3: Sine wave plot using numpy and matplotlib packages
Save the following script as sine.py:
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
x = np.linspace(-np.pi, np.pi, 201)
plt.plot(x, np.sin(x))
plt.xlabel('Angle [rad]')
plt.ylabel('sin(x)')
plt.axis('tight')
plt.savefig('sine.png')
Change your job submission file to submit this script and the job will output a png file and blank standard output and error files.
For more information about Python:
Managing Environments with Conda
Conda is a package manager in Anaconda that allows you to create and manage multiple environments where you can pick and choose which packages you want to use. To use Conda you must load an Anaconda module:
$ module load conda
Many packages are pre-installed in the global environment. To see these packages:
$ conda list
To create your own custom environment:
$ conda create --name MyEnvName python=3.8 FirstPackageName SecondPackageName -y
The --name option specifies that the environment created will be named MyEnvName. You can include as many packages as you require separated by a space. Including the -y option lets you skip the prompt to install the package. By default environments are created and stored in the $HOME/.conda directory.
To create an environment at a custom location:
$ conda create --prefix=$HOME/MyEnvName python=3.8 PackageName -y
To see a list of your environments:
$ conda env list
To remove unwanted environments:
$ conda remove --name MyEnvName --all
To add packages to your environment:
$ conda install --name MyEnvName PackageNames
To remove a package from an environment:
$ conda remove --name MyEnvName PackageName
Installing packages when creating your environment, instead of one at a time, will help you avoid dependency issues.
To activate or deactivate an environment you have created:
$ source activate MyEnvName
$ source deactivate MyEnvName
If you created your conda environment at a custom location using --prefix option, then you can activate or deactivate it using the full path.
$ source activate $HOME/MyEnvName
$ source deactivate $HOME/MyEnvName
To use a custom environment inside a job you must load the module and activate the environment inside your job submission script. Add the following lines to your submission script:
$ module load conda
$ source activate MyEnvName
For more information about Python:
Installing Packages
Installing Python packages in an Anaconda environment is recommended. One key advantage of Anaconda is that it allows users to install unrelated packages in separate self-contained environments. Individual packages can later be reinstalled or updated without impacting others. If you are unfamiliar with Conda environments, please check our Conda Guide.
To facilitate the process of creating and using Conda environments, we support a script (conda-env-mod) that generates a module file for an environment, as well as an optional Jupyter kernel to use this environment in a JupyterHub notebook.
You must load one of the anaconda modules in order to use this script.
$ module load conda
Step-by-step instructions for installing custom Python packages are presented below.
Link to section 'Step 1: Create a conda environment' of 'Installing Packages' Step 1: Create a conda environment
Users can use the conda-env-mod script to create an empty conda environment. This script needs either a name or a path for the desired environment. After the environment is created, it generates a module file for using it in future. Please note that conda-env-mod is different from the official conda-env script and supports a limited set of subcommands. Detailed instructions for using conda-env-mod can be found with the command conda-env-mod --help.
-
Example 1: Create a conda environment named mypackages in user's
$HOME
directory.$ conda-env-mod create -n mypackages
-
Example 2: Create a conda environment named mypackages at a custom location.
$ conda-env-mod create -p /depot/mylab/apps/mypackages
Please follow the on-screen instructions while the environment is being created. After finishing, the script will print the instructions to use this environment.
... ... ... Preparing transaction: ...working... done Verifying transaction: ...working... done Executing transaction: ...working... done +------------------------------------------------------+ | To use this environment, load the following modules: | | module load use.own | | module load conda-env/mypackages-py3.8.5 | +------------------------------------------------------+ Your environment "mypackages" was created successfully.
Note down the module names, as you will need to load these modules every time you want to use this environment. You may also want to add the module load lines in your jobscript, if it depends on custom Python packages.
By default, module files are generated in your $HOME/privatemodules directory. The location of module files can be customized by specifying the -m /path/to/modules option to conda-env-mod.
Note: The main differences between -p and -m are: 1) -p will change the location of packages to be installed for the env and the module file will still be located at the $HOME/privatemodules directory as defined in use.own. 2) -m will only change the location of the module file. So the method to load modules created with -m and -p are different, see Example 3 for details.
- Example 3: Create a conda environment named labpackages in your group's Data Depot space and place the module file at a shared location for the group to use.
$ conda-env-mod create -p /depot/mylab/apps/labpackages -m /depot/mylab/etc/modules ... ... ... Preparing transaction: ...working... done Verifying transaction: ...working... done Executing transaction: ...working... done +-------------------------------------------------------+ | To use this environment, load the following modules: | | module use /depot/mylab/etc/modules | | module load conda-env/labpackages-py3.8.5 | +-------------------------------------------------------+ Your environment "labpackages" was created successfully.
If you used a custom module file location, you need to run the module use command as printed by the command output above.
By default, only the environment and a module file are created (no Jupyter kernel). If you plan to use your environment in a JupyterHub notebook, you need to append a --jupyter flag to the above commands.
- Example 4: Create a Jupyter-enabled conda environment named labpackages in your group's Data Depot space and place the module file at a shared location for the group to use.
$ conda-env-mod create -p /depot/mylab/apps/labpackages -m /depot/mylab/etc/modules --jupyter ... ... ... Jupyter kernel created: "Python (My labpackages Kernel)" ... ... ... Your environment "labpackages" was created successfully.
Link to section 'Step 2: Load the conda environment' of 'Installing Packages' Step 2: Load the conda environment
-
The following instructions assume that you have used conda-env-mod script to create an environment named mypackages (Examples 1 or 2 above). If you used conda create instead, please use conda activate mypackages.
$ module load use.own $ module load conda-env/mypackages-py3.8.5
Note that the conda-env module name includes the Python version that it supports (Python 3.8.5 in this example). This is same as the Python version in the conda module.
-
If you used a custom module file location (Example 3 above), please use module use to load the conda-env module.
$ module use /depot/mylab/etc/modules $ module load conda-env/labpackages-py3.8.5
Link to section 'Step 3: Install packages' of 'Installing Packages' Step 3: Install packages
Now you can install custom packages in the environment using either conda install or pip install.
Link to section 'Installing with conda' of 'Installing Packages' Installing with conda
-
Example 1: Install OpenCV (open-source computer vision library) using conda.
$ conda install opencv
-
Example 2: Install a specific version of OpenCV using conda.
$ conda install opencv=4.5.5
-
Example 3: Install OpenCV from a specific anaconda channel.
$ conda install -c anaconda opencv
Link to section 'Installing with pip' of 'Installing Packages' Installing with pip
-
Example 4: Install pandas using pip.
$ pip install pandas
-
Example 5: Install a specific version of pandas using pip.
$ pip install pandas==1.4.3
Follow the on-screen instructions while the packages are being installed. If installation is successful, please proceed to the next section to test the packages.
Note: Do NOT run Pip with the --user argument, as that will install packages in a different location and might mess up your account environment.
Link to section 'Step 4: Test the installed packages' of 'Installing Packages' Step 4: Test the installed packages
To use the installed Python packages, you must load the module for your conda environment. If you have not loaded the conda-env module, please do so following the instructions at the end of Step 1.
$ module load use.own
$ module load conda-env/mypackages-py3.8.5
- Example 1: Test that OpenCV is available.
$ python -c "import cv2; print(cv2.__version__)"
- Example 2: Test that pandas is available.
$ python -c "import pandas; print(pandas.__version__)"
If the commands finished without errors, then the installed packages can be used in your program.
Link to section 'Additional capabilities of conda-env-mod script' of 'Installing Packages' Additional capabilities of conda-env-mod script
The conda-env-mod tool is intended to facilitate creation of a minimal Anaconda environment, matching module file and optionally a Jupyter kernel. Once created, the environment can then be accessed via familiar module load command, tuned and expanded as necessary. Additionally, the script provides several auxiliary functions to help manage environments, module files and Jupyter kernels.
General usage for the tool adheres to the following pattern:
$ conda-env-mod help
$ conda-env-mod <subcommand> <required argument> [optional arguments]
where required arguments are one of
- -n|--name ENV_NAME (name of the environment)
- -p|--prefix ENV_PATH (location of the environment)
and optional arguments further modify behavior for specific actions (e.g. -m to specify alternative location for generated module files).
Given a required name or prefix for an environment, the conda-env-mod script supports the following subcommands:
- create - to create a new environment, its corresponding module file and optional Jupyter kernel.
- delete - to delete existing environment along with its module file and Jupyter kernel.
- module - to generate just the module file for a given existing environment.
- kernel - to generate just the Jupyter kernel for a given existing environment (note that the environment has to be created with a --jupyter option).
- help - to display script usage help.
Using these subcommands, you can iteratively fine-tune your environments, module files and Jupyter kernels, as well as delete and re-create them with ease. Below we cover several commonly occurring scenarios.
Note: When you try to use conda-env-mod delete, remember to include the arguments as you create the environment (i.e. -p package_location and/or -m module_location).
Link to section 'Generating module file for an existing environment' of 'Installing Packages' Generating module file for an existing environment
If you already have an existing configured Anaconda environment and want to generate a module file for it, follow appropriate examples from Step 1 above, but use the module subcommand instead of the create one. E.g.
$ conda-env-mod module -n mypackages
and follow printed instructions on how to load this module. With an optional --jupyter flag, a Jupyter kernel will also be generated.
Note that the module name mypackages should be exactly the same with the older conda environment name. Note also that if you intend to proceed with a Jupyter kernel generation (via the --jupyter flag or a kernel subcommand later), you will have to ensure that your environment has ipython and ipykernel packages installed into it. To avoid this and other related complications, we highly recommend making a fresh environment using a suitable conda-env-mod create .... --jupyter command instead.
Link to section 'Generating Jupyter kernel for an existing environment' of 'Installing Packages' Generating Jupyter kernel for an existing environment
If you already have an existing configured Anaconda environment and want to generate a Jupyter kernel file for it, you can use the kernel subcommand. E.g.
$ conda-env-mod kernel -n mypackages
This will add a "Python (My mypackages Kernel)" item to the dropdown list of available kernels upon your next login to the JupyterHub.
Note that generated Jupiter kernels are always personal (i.e. each user has to make their own, even for shared environments). Note also that you (or the creator of the shared environment) will have to ensure that your environment has ipython and ipykernel packages installed into it.
Link to section 'Managing and using shared Python environments' of 'Installing Packages' Managing and using shared Python environments
Here is a suggested workflow for a common group-shared Anaconda environment with Jupyter capabilities:
The PI or lab software manager:
-
Creates the environment and module file (once):
$ module purge $ module load conda $ conda-env-mod create -p /depot/mylab/apps/labpackages -m /depot/mylab/etc/modules --jupyter
-
Installs required Python packages into the environment (as many times as needed):
$ module use /depot/mylab/etc/modules $ module load conda-env/labpackages-py3.8.5 $ conda install ....... # all the necessary packages
Lab members:
-
Lab members can start using the environment in their command line scripts or batch jobs simply by loading the corresponding module:
$ module use /depot/mylab/etc/modules $ module load conda-env/labpackages-py3.8.5 $ python my_data_processing_script.py .....
-
To use the environment in Jupyter notebooks, each lab member will need to create his/her own Jupyter kernel (once). This is because Jupyter kernels are private to individuals, even for shared environments.
$ module use /depot/mylab/etc/modules $ module load conda-env/labpackages-py3.8.5 $ conda-env-mod kernel -p /depot/mylab/apps/labpackages
A similar process can be devised for instructor-provided or individually-managed class software, etc.
Link to section 'Troubleshooting' of 'Installing Packages' Troubleshooting
- Python packages often fail to install or run due to dependency incompatibility with other packages. More specifically, if you previously installed packages in your home directory it is safer to clean those installations.
$ mv ~/.local ~/.local.bak $ mv ~/.cache ~/.cache.bak
- Unload all the modules.
$ module purge
- Clean up PYTHONPATH.
$ unset PYTHONPATH
- Next load the modules (e.g. anaconda) that you need.
$ module load conda/2024.02-py311 $ module load use.own $ module load conda-env/2024.02-py311
- Now try running your code again.
- Few applications only run on specific versions of Python (e.g. Python 3.6). Please check the documentation of your application if that is the case.
Installing Packages from Source
We maintain several Anaconda installations. Anaconda maintains numerous popular scientific Python libraries in a single installation. If you need a Python library not included with normal Python we recommend first checking Anaconda. For a list of modules currently installed in the Anaconda Python distribution:
$ module load conda
$ conda list
# packages in environment at /apps/spack/bell/apps/anaconda/2020.02-py37-gcc-4.8.5-u747gsx:
#
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py37_0
_libgcc_mutex 0.1 main
alabaster 0.7.12 py37_0
anaconda 2020.02 py37_0
...
If you see the library in the list, you can simply import it into your Python code after loading the Anaconda module.
If you do not find the package you need, you should be able to install the library in your own Anaconda customization. First try to install it with Conda or Pip. If the package is not available from either Conda or Pip, you may be able to install it from source.
Use the following instructions as a guideline for installing packages from source. Make sure you have a download link to the software (usually it will be a tar.gz
archive file). You will substitute it on the wget line below.
We also assume that you have already created an empty conda environment as described in our Python package installation guide.
$ mkdir ~/src
$ cd ~/src
$ wget http://path/to/source/tarball/app-1.0.tar.gz
$ tar xzvf app-1.0.tar.gz
$ cd app-1.0
$ module load conda
$ module load use.own
$ module load conda-env/mypackages-py3.8.5
$ python setup.py install
$ cd ~
$ python
>>> import app
>>> quit()
The "import app" line should return without any output if installed successfully. You can then import the package in your python scripts.
If you need further help or run into any issues installing a library, contact us or drop by Coffee Hour for in-person help.
For more information about Python:
R
R, a GNU project, is a language and environment for data manipulation, statistics, and graphics. It is an open source version of the S programming language. R is quickly becoming the language of choice for data science due to the ease with which it can produce high quality plots and data visualizations. It is a versatile platform with a large, growing community and collection of packages.
For more general information on R visit The R Project for Statistical Computing.
Loading Data into R
R is an environment for manipulating data. In order to manipulate data, it must be brought into the R environment. R has a function to read any file that data is stored in. Some of the most common file types like comma-separated variable(CSV) files have functions that come in the basic R packages. Other less common file types require additional packages to be installed. To read data from a CSV file into the R environment, enter the following command in the R prompt:
> read.csv(file = "path/to/data.csv", header = TRUE)
When R reads the file it creates an object that can then become the target of other functions. By default the read.csv() function will give the object the name of the .csv file. To assign a different name to the object created by read.csv enter the following in the R prompt:
> my_variable <- read.csv(file = "path/to/data.csv", header = FALSE)
To display the properties (structure) of loaded data, enter the following:
> str(my_variable)
For more functions and tutorials:
Installing R packages
Link to section 'Challenges of Managing R Packages in the Cluster Environment' of 'Installing R packages' Challenges of Managing R Packages in the Cluster Environment
- Different clusters have different hardware and softwares. So, if you have access to multiple clusters, you must install your R packages separately for each cluster.
- Each cluster has multiple versions of R and packages installed with one version of R may not work with another version of R. So, libraries for each R version must be installed in a separate directory.
- You can define the directory where your R packages will be installed using the environment variable
R_LIBS_USER
. - For your convenience, a sample ~/.Rprofile example file is provided that can be downloaded to your cluster account and renamed into
~/.Rprofile
(or appended to one) to customize your installation preferences. Detailed instructions.
Link to section 'Installing Packages' of 'Installing R packages' Installing Packages
-
Step 0: Set up installation preferences.
Follow the steps for setting up your~/.Rprofile
preferences. This step needs to be done only once. If you have created a~/.Rprofile
file previously on Gautschi, ignore this step. -
Step 1: Check if the package is already installed.
As part of the R installations on community clusters, a lot of R libraries are pre-installed. You can check if your package is already installed by opening an R terminal and entering the commandinstalled.packages()
. For example,module load r/4.4.1 R
installed.packages()["units",c("Package","Version")] Package Version "units" "0.8-1" quit()
If the package you are trying to use is already installed, simply load the library, e.g.,
library('units')
. Otherwise, move to the next step to install the package. -
Step 2: Load required dependencies. (if needed)
For simple packages you may not need this step. However, some R packages depend on other libraries. For example, thesf
package depends ongdal
andgeos
libraries. So, you will need to load the corresponding modules before installingsf
. Read the documentation for the package to identify which modules should be loaded.module load gdal module load geos
-
Step 3: Install the package.
Now install the desired package using the commandinstall.packages('package_name')
. R will automatically download the package and all its dependencies from CRAN and install each one. Your terminal will show the build progress and eventually show whether the package was installed successfully or not.R
install.packages('sf', repos="https://cran.case.edu/") Installing package into ‘/home/myusername/R/x86_64-pc-linux-gnu-library/4.4.1’ (as ‘lib’ is unspecified) trying URL 'https://cran.case.edu/src/contrib/sf_0.9-7.tar.gz' Content type 'application/x-gzip' length 4203095 bytes (4.0 MB) ================================================== downloaded 4.0 MB ... ... more progress messages ... ... ** testing if installed package can be loaded from final location ** testing if installed package keeps a record of temporary installation path * DONE (sf) The downloaded source packages are in ‘/tmp/RtmpSVAGio/downloaded_packages’
- Step 4: Troubleshooting. (if needed)
If Step 3 ended with an error, you need to investigate why the build failed. Most common reason for build failure is not loading the necessary modules.
Link to section 'Loading Libraries' of 'Installing R packages' Loading Libraries
Once you have packages installed you can load them with the library()
function as shown below:
library('packagename')
The package is now installed and loaded and ready to be used in R.
Link to section 'Example: Installing dplyr' of 'Installing R packages' Example: Installing dplyr
The following demonstrates installing the dplyr
package assuming the above-mentioned custom ~/.Rprofile
is in place (note its effect in the "Installing package into" information message):
module load r
R
install.packages('dplyr', repos="http://ftp.ussg.iu.edu/CRAN/")
Installing package into ‘/home/myusername/R/gautschi/4.4.1’
(as ‘lib’ is unspecified)
...
also installing the dependencies 'crayon', 'utf8', 'bindr', 'cli', 'pillar', 'assertthat', 'bindrcpp', 'glue', 'pkgconfig', 'rlang', 'Rcpp', 'tibble', 'BH', 'plogr'
...
...
...
The downloaded source packages are in
'/tmp/RtmpHMzm9z/downloaded_packages'
library(dplyr)
Attaching package: 'dplyr'
For more information about installing R packages:
RStudio
RStudio is a graphical integrated development environment (IDE) for R. RStudio is the most popular environment for developing both R scripts and packages. RStudio is provided on most Research systems.
There are two methods to launch RStudio on the cluster: command-line and application menu icon.
Link to section 'Launch RStudio by the command-line:' of 'RStudio' Launch RStudio by the command-line:
module load gcc
module load r
module load rstudio
rstudio
Note that RStudio is a graphical program and in order to run it you must have a local X11 server running or use Thinlinc Remote Desktop environment. See the ssh X11 forwarding section for more details.
Link to section 'Launch Rstudio by the application menu icon:' of 'RStudio' Launch Rstudio by the application menu icon:
- Log into desktop.gautschi.rcac.purdue.edu with web browser or ThinLinc client
- Click on the
Applications
drop down menu on the top left corner - Choose
Cluster Software
and thenRStudio
R and RStudio are free to download and run on your local machine. For more information about RStudio:
Running R jobs
This section illustrates how to submit a small R job to a SLURM queue. The example job computes a Pythagorean triple.
Prepare an R input file with an appropriate filename, here named myjob.R:
# FILENAME: myjob.R
# Compute a Pythagorean triple.
a = 3
b = 4
c = sqrt(a*a + b*b)
c # display result
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/bash
# FILENAME: myjob.sub
module load r
# --vanilla:
# --no-save: do not save datasets at the end of an R session
R --vanilla --no-save < myjob.R
For other examples or R jobs:
Singularity
On Gautschi, Singularity functionality is provided by Apptainer - see Apptainer section for details.
Apptainer
Note: Apptainer was formerly known as Singularity and is now a part of the Linux Foundation. When migrating from Singularity see the user compatibility documentation.
Link to section 'What is Apptainer?' of 'Apptainer' What is Apptainer?
Apptainer is an open-source container platform designed to be simple, fast, and secure. It allows the portability and reproducibility of operating systems and application environments through the use of Linux containers. It gives users complete control over their environment.
Apptainer is like Docker but tuned explicitly for HPC clusters. More information is available on the project’s website.
Link to section 'Features' of 'Apptainer' Features
- Run the latest applications on an Ubuntu or Centos userland
- Gain access to the latest developer tools
- Launch MPI programs easily
- Much more
Apptainer’s user guide is available at: apptainer.org/docs/user/main/introduction.html
Link to section 'Example' of 'Apptainer' Example
Here is an example using an Ubuntu 16.04 image on Gautschi:
apptainer exec /depot/itap/singularity/ubuntu1604.img cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04 LTS"
Here is another example using a Centos 7 image:
apptainer exec /depot/itap/singularity/centos7.img cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
Link to section 'Purdue Cluster Specific Notes' of 'Apptainer' Purdue Cluster Specific Notes
All service providers will integrate Apptainer slightly differently depending on site. The largest customization will be which default files are inserted into your images so that routine services will work.
Services we configure for your images include DNS settings and account information. File systems we overlay into your images are your home directory, scratch, Data Depot, and application file systems.
Here is a list of paths:
- /etc/resolv.conf
- /etc/hosts
- /home/$USER
- /apps
- /scratch
- /depot
This means that within the container environment these paths will be present and the same as outside the container. The /apps
, /scratch
, and /depot
directories will need to exist inside your container to work properly.
Link to section 'Creating Apptainer Images' of 'Apptainer' Creating Apptainer Images
You can build on your system or straight on the cluster (you do not need root privileges to build or run the container).
You can find information and documentation for how to install and use Apptainer on your system:
We have version 1.1.6
(or newer) on the cluster. Please note that installed versions may change throughout cluster life time, so when in doubt, please check exact version with a --version
command line flag:
apptainer --version
apptainer version 1.3.3-1.el9
Everything you need on how to build a container is available from their user guide. Below are merely some quick tips for getting your own containers built for Gautschi.
You can use a Definition File to both build your container and share its specification with collaborators (for the sake of reproducibility). Here is a simplistic example of such a file:
# FILENAME: Buildfile
Bootstrap: docker
From: ubuntu:18.04
%post
apt-get update && apt-get upgrade -y
mkdir /apps /depot /scratch
To build the image itself:
apptainer build ubuntu-18.04.sif Buildfile
The challenge with this approach however is that it must start from scratch if you decide to change something. In order to create a container image iteratively and interactively, you can use the --sandbox
option.
apptainer build --sandbox ubuntu-18.04 docker://ubuntu:18.04
This will not create a flat image file but a directory tree (i.e., a folder), the contents of which are the container's filesystem. In order to get a shell inside the container that allows you to modify it, user the --writable
option.
apptainer shell --writable ubuntu-18.04
Apptainer>
You can then proceed to install any libraries, software, etc. within the container. Then to create the final image file, exit
the shell and call the build
command once more on the sandbox.
apptainer build ubuntu-18.04.sif ubuntu-18.04
Finally, copy the new image to Gautschi and run it.
GPU Usage Monitoring
Link to section 'What is it?' of 'GPU Usage Monitoring' What is it?
To ensure that GPUs are effectively utilized on our cluster, we log and store the power, memory, and utilization of every GPU on our cluster every 10 seconds. RCAC uses this data to identify users that request large amounts of GPU hours but do not actually use them.
To assist users in optimizing their usage of requested GPU resources, we have created a GPU usage monitor tool that allows users to track their GPU utilization for jobs that they have submitted. The purpose of this tool is to provide users an interface to identify jobs or GPUs that they have requested but are not being effectively utilized.
Link to section 'How to use' of 'GPU Usage Monitoring' How to use
Our GPU Usage monitor is currently implemented on Gautschi
and Gilbreth
, and is available at the following location:
/apps/external/gpu_util/get_gpu_util
Link to section 'Seeing Logged Jobs' of 'GPU Usage Monitoring' Seeing Logged Jobs
You can check all the jobs that you've run with the -u
or --user_jobs
flag:
/apps/external/gpu_util/get_gpu_util --user_jobs
This will print a table of all the GPU jobs you have submitted to the cluster, with the job ID, start and end times, and the number of GPUs that that job had allocated to it:
It should be noted that we collect GPU usage information every 3 - 3.5 hours. If you have recently submitted a job, your job may not be available yet.
Link to section 'Inspecting a specific job' of 'GPU Usage Monitoring' Inspecting a specific job
If you want to view the GPU utilization of a specific job, you can use the -j
or --job_id
flag. For example, to see the utilization of job 32083, you may run the following:
/apps/external/gpu_util/get_gpu_util --job_id 32083
This will print a table of the GPU memory and utilization for each GPU that was allocated for the specified job:

Link to section 'Plotting Data' of 'GPU Usage Monitoring' Plotting Data
It is also possible to plot the GPU utilization and memory usage over time with the -p
or the --plot
flag:
/apps/external/gpu_util/get_gpu_util --job_id 32083 --plot
This will print two graphs. The first will be a 2-minute rolling average of the GPU utilization over time, and the second will be the percentage of memory used over time. Each GPU will be colored differently:
Link to section 'Saving GPU Data' of 'GPU Usage Monitoring' Saving GPU Data
If you would like to save the raw GPU utilization and memory usage data for your job, you can use the -s
or --save
flag:
/apps/external/gpu_util/get_gpu_util --job_id 32083 --save
This will download the GPU data for a specific job as:
{job_id}_gpu_usage.csv
Link to section 'FAQ/Troubleshooting' of 'GPU Usage Monitoring' FAQ/Troubleshooting
- "Why doesn't my job show up in the list?"
- Although we log GPU utilization every 10 seconds, we only actually collect it every 3-3.5 hours. If you check later, your job should be listed.
- "Can other people see my data?"
- No, our interface only allows users to check the utilization of their own jobs.
- "Is this a good way to see the GPU Hours balance?"
- No, the primary purpose of this tool is to check that GPUs are actually being used, not to track billed GPU hours. To see the total GPU hours balance, please use the
slist
command.
- No, the primary purpose of this tool is to check that GPUs are actually being used, not to track billed GPU hours. To see the total GPU hours balance, please use the