Anvil User Guide

 

Overview of Anvil

Purdue University is the home of Anvil, a powerful new supercomputer that provides advanced computing capabilities to support a wide range of computational and data-intensive research spanning from traditional high-performance computing to modern artificial intelligence applications.

Anvil, which is funded by a $10 million award from the National Science Foundation, significantly increases the capacity available to the NSF's Extreme Science and Engineering Discovery Environment (XSEDE), which serves tens of thousands of researchers across the U.S., and in which Purdue has been a partner for the past nine years. Anvil enters production in 2021 and will serve researchers for five years. Additional funding from the NSF supports Anvil's operations and user support.

The name "Anvil" reflects the Purdue Boilermakers' strength and workmanlike focus on producing results, and the Anvil supercomputer enables important discoveries across many different areas of science and engineering. Anvil also serves as an experiential learning laboratory for students to gain real-world experience using computing for their science, and for student interns to work with the Anvil team for construction and operation. We will be training the research computing practitioners of the future. Learn more about Anvil's mission in the Anvil press release.

Anvil is built in partnership with Dell and AMD and consists of 1,000 nodes with two 64-core AMD Epyc "Milan" processors each and will deliver over 1 billion CPU core hours to XSEDE each year, with a peak performance of 5.3 petaflops. Anvil's nodes are interconnected with 100 Gbps Mellanox HDR InfiniBand. The supercomputer ecosystem also includes 32 large memory nodes, each with 1 TB of RAM, and 16 nodes each with four NVIDIA A100 Tensor Core GPUs providing 1.5 PF of single-precision performance to support machine learning and artificial intelligence applications.

Anvil is funded under NSF award number 2005632. Carol Song is the principal investigator and project director. Preston Smith, executive director of Research Computing, Xiao Zhu, computational scientist and senior research scientist, and Rajesh Kalyanam, data scientist, software engineer, and research scientist, are all co-PIs on the project.

Anvil Specifications

All Anvil compute nodes have 128 processor cores, 256 GB to 1 TB of RAM, and 100 Gbps InfiniBand interconnects.

Anvil Login
Sub-Cluster | Number of Nodes | Processors per Node | Cores per Node | Memory per Node | Retires in
Login | 8 | Two Milan CPUs @ 2.0GHz | 32 | 256 GB | 2026

Anvil Sub-Clusters
Sub-Cluster | Number of Nodes | Processors per Node | Cores per Node | Memory per Node | Retires in
A | 1,000 | Two Milan CPUs @ 2.0GHz | 128 | 256 GB | 2026
B | 32 | Two Milan CPUs @ 2.0GHz | 128 | 1 TB | 2026
C | 16 | Two Milan CPUs @ 2.0GHz + Four NVIDIA A100 GPUs | 128 | 256 GB | 2026

Anvil nodes run CentOS 8 and use Slurm (Simple Linux Utility for Resource Management) as the batch scheduler for resource and job management. The application of operating system patches will occur as security needs dictate. All nodes allow for unlimited stack usage, as well as unlimited core dump size (though disk space and server quotas may still be a limiting factor).

Accessing the System

Accounts on Anvil

Obtaining an Account

As an XSEDE computing resource, Anvil is accessible to XSEDE users who are given an allocation on the system. To obtain an account, users may submit a proposal through the XSEDE Allocation Request System.

Interested parties may contact the XSEDE Help Desk for help with an Anvil proposal.

Logging In

Anvil will be accessible via the XSEDE Single Sign-On (SSO) hub.

To login to the XSEDE SSO hub, use your SSH client to start an SSH session on login.xsede.org with your XSEDE User Portal username and password:

localhost$ ssh -l XUPusername login.xsede.org

XSEDE now requires that you use the XSEDE Duo service for additional authentication; you will be prompted to authenticate yourself further using Duo and your Duo client app, token, or other contact methods. Consult Multi-Factor Authentication with Duo for account setup instructions.

Once logged into the hub, use the gsissh utility to log in to Anvil, where you have an account.

[XUPusername@ssohub ~]$ gsissh anvil

When reporting a problem to the help desk, please execute the gsissh command with the -vvv option and include the verbose output in your problem description.

Check Allocation Usage

To keep track of the usage of the allocation by your project team, you can use mybalance:

x-anvilusername@login01:~ $ mybalance

Allocation          Type  SU Limit   SU Usage  SU Usage  SU Balance
Account                             (account)    (user)
===============  =======  ========  ========= =========  ==========
ascxxxxxx           CPU    1000.0       95.7       0.0       904.3

You can also check the allocation usage through the XSEDE User Portal.

System Architecture

Compute Nodes

Compute Node Specifications
Model: 3rd Gen AMD EPYC™ CPUs (AMD EPYC 7763)
Sockets per node: 2
Cores per socket: 64
Cores per node: 128
Hardware threads per core: 1
Hardware threads per node: 128
Clock rate: 2.45GHz (3.5GHz max boost)
RAM: 256 GB DDR4-3200 (regular compute nodes); 1 TB DDR4-3200 (32 large-memory nodes)
Cache: L1d cache: 32K/core
L1i cache: 32K/core
L2 cache: 512K/core
L3 cache: 32768K
Local storage: 240GB local disk

Login Nodes

Login Node Specifications
Number of Nodes | Processors per Node | Cores per Node | Memory per Node
8 | 3rd Gen AMD EPYC™ 7543 CPU | 32 | 512 GB

Specialized Nodes

Specialized Node Specifications
Sub-Cluster | Number of Nodes | Processors per Node | Cores per Node | Memory per Node
B | 32 | Two 3rd Gen AMD EPYC™ 7763 CPUs | 128 | 1 TB
C | 16 | Two 3rd Gen AMD EPYC™ 7763 CPUs + Four NVIDIA A100 GPUs | 128 | 512 GB

Network

All nodes, as well as the scratch storage system, are interconnected by an oversubscribed (3:1 fat tree) HDR InfiniBand interconnect. The nominal per-node bandwidth is 100 Gbps, with message latency as low as 0.90 microseconds. The fabric is implemented as a two-stage fat tree. Nodes are directly connected to Mellanox QM8790 switches with 60 HDR100 links down to nodes and 10 links to spine switches.

Running Jobs

Users familiar with the Linux command line may use standard job submission utilities to manage and run jobs on the Anvil compute nodes.

Accessing the Compute Nodes

Anvil uses the Slurm Workload Manager for job scheduling and management. With Slurm, a user requests resources and submits a job to a queue. The system takes jobs from queues, allocates the necessary compute nodes, and executes them. Users typically SSH to an Anvil login node to access the Slurm job scheduler, but they should always submit computationally intensive work to Slurm as jobs rather than running it directly on a login node. All users share the login nodes, and running anything but the smallest test job will negatively impact everyone's ability to use Anvil.

Anvil is designed to serve the moderate-scale computation and data needs of the majority of XSEDE users. Users with allocations can submit to a variety of queues with varying job size and walltime limits. Separate sets of queues are utilized for the CPU, GPU, and large memory nodes. Typically, queues with shorter walltime and smaller job size limits will feature faster turnarounds. Some additional points to be aware of regarding the Anvil queues are:

  • Anvil provides a debug queue for testing and debugging codes.
  • Anvil supports shared-node jobs (more than one job on a single node). Many applications are serial or can only scale to a few cores. Allowing shared nodes improves job throughput, provides higher overall system utilization, and allows more users to run on Anvil.
  • Anvil supports long-running jobs - run times can be extended to four days for jobs using up to 16 full nodes.
  • The maximum allowable job size on Anvil is 7,168 cores. To run larger jobs, submit a consulting ticket to discuss with Anvil support.
  • Shared-node queues will be utilized for managing jobs on the GPU and large memory nodes.

Job Accounting

The charge unit for Anvil is the Service Unit (SU). One SU corresponds to the use of one compute core utilizing less than or equal to approximately 2 GB of memory for one hour, or one GPU for one hour. Keep in mind that your charges are based on the resources that are tied up by your job and do not necessarily reflect how the resources are used. Charges on jobs submitted to the shared queues are based on the number of cores or the fraction of the memory requested, whichever is larger. Jobs submitted as node-exclusive will be charged for all 128 cores, whether the resources are used or not. Jobs submitted to the large memory nodes will be charged 4 SU per compute core (4x the standard node charge). The minimum charge for any job is 1 SU. Filesystem storage is not charged.
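
As an illustration of the shared-queue rule above (approximate numbers, since the exact accounting depends on the node's usable memory), a shared job requesting 16 cores and 128 GB of memory, half of a standard node's 256 GB, is charged as the larger fraction, roughly 64 SU per hour; the same 16 cores with 32 GB of memory is charged about 16 SU per hour.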

Queues

Anvil provides different queues with varying job size and walltime. There are also limits on the number of jobs queued and running on a per allocation and queue basis. Queues and limits are subject to change based on the evaluation from the Early User Program.

Anvil Production Queues
Queue Name | Node Type | Max Nodes per Job | Max Cores per Job | Max Duration | Max Running Jobs in Queue | Charging Factor
debug | regular | 2 nodes | 256 cores | 2 hrs | 1 | 1
gpu-debug | gpu | 1 node | 2 GPUs | 0.5 hrs | 1 | 1
standard | regular | 16 nodes | 2,048 cores | 96 hrs | 50 | 1
wide | regular | 56 nodes | 7,168 cores | 12 hrs | 5 | 1
shared | regular | 1 node | 128 cores | 96 hrs | 4000 | 1
highmem | large-memory | 1 node | 128 cores | 48 hrs | 2 | 4
gpu | gpu | 2 nodes | 4 GPUs | 48 hrs | 2 | 1

Batch Jobs

Job Submission Script

To submit work to a Slurm queue, you must first create a job submission file. This job submission file is essentially a simple shell script. It will set any required environment variables, load any necessary modules, create or modify files and directories, and run any applications that you need:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# Loads Matlab and sets the application up
module load matlab

# Change to the directory from which you originally submitted this job.
cd $SLURM_SUBMIT_DIR

# Runs a Matlab script named 'myscript'
matlab -nodisplay -singleCompThread -r myscript

The standard Slurm environment variables that can be used in the job submission file are listed in the table below:

Job Script Environment Variables
Name | Description
SLURM_SUBMIT_DIR | Absolute path of the current working directory when you submitted this job
SLURM_JOBID | Job ID number assigned to this job by the batch system
SLURM_JOB_NAME | Job name supplied by the user
SLURM_JOB_NODELIST | Names of nodes assigned to this job
SLURM_SUBMIT_HOST | Hostname of the system where you submitted this job
SLURM_JOB_PARTITION | Name of the original queue to which you submitted this job

Once your script is prepared, you are ready to submit your job.

Submitting a Job

Once you have a job submission file, you may submit this script to SLURM using the sbatch command. Slurm will find, or wait for, available resources matching your request and run your job there.

To submit your job to one compute node with one task:

$ sbatch --nodes=1 --ntasks=1 myjobsubmissionfile

By default, each job receives 30 minutes of wall time, or clock time. If you know that your job will not need more than a certain amount of time to run, request less than the maximum wall time, as this may allow your job to run sooner. To request 1 hour and 30 minutes of wall time:

$ sbatch -t 1:30:00 --nodes=1  --ntasks=1 myjobsubmissionfile

Each compute node in Anvil has 128 processor cores. In some cases, you may want to request multiple nodes. To utilize multiple nodes, you will need to have a program or code that is specifically programmed to use multiple nodes such as with MPI. Simply requesting more nodes will not make your work go faster. Your code must utilize all the cores to support this ability. To request 2 compute nodes with 256 tasks:

$ sbatch --nodes=2 --ntasks=256 myjobsubmissionfile

If more convenient, you may also specify any command line options to sbatch from within your job submission file, using a special form of comment:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#SBATCH -A myallocation
#SBATCH -p queue-name # the default queue is "standard" queue
#SBATCH --nodes=1
#SBATCH --ntasks=1 
#SBATCH --time=1:30:00
#SBATCH --job-name myjobname

module purge # Unload all loaded modules and reset everything to original state.
module load ...
...
module list # List currently loaded modules.
# Print the hostname of the compute node on which this job is running.
/bin/hostname

If an option is present in both your job submission file and on the command line, the option on the command line will take precedence.
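
For example, a wall time given on the command line overrides one set inside the script:

$ sbatch --time=2:00:00 myjobsubmissionfile    # overrides "#SBATCH --time=1:30:00" in the file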

After you submit your job with sbatch, it may wait in the queue for minutes, hours, or even days. How long it takes for a job to start depends on the specific queue, the available resources, the time requested, and the other jobs already waiting in that queue. It is impossible to say for sure when any given job will start. For best results, request no more resources than your job requires.

Once your job is submitted, you can monitor the job status, wait for the job to complete, and check the job output.

Checking Job Status

Once a job is submitted there are several commands you can use to monitor the progress of the job. To see your jobs, use the squeue -u command and specify your username.

$ squeue -u myusername
   JOBID   PARTITION   NAME     USER       ST    TIME   NODES   NODELIST(REASON)
   188     standard    job1   myusername   R     0:14      2    a[010-011]
   189     standard    job2   myusername   R     0:15      1    a012

To retrieve useful information about your queued or running job, use the scontrol show job command with your job's ID number.

$ scontrol show job 189
JobId=189 JobName=myjobname
   UserId=myusername GroupId=mygroup MCS_label=N/A
   Priority=103076 Nice=0 Account=myacct QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=0 Reboot=0 ExitCode=0:0
   RunTime=00:01:28 TimeLimit=00:30:00 TimeMin=N/A
   SubmitTime=2021-10-04T14:59:52 EligibleTime=2021-10-04T14:59:52
   AccrueTime=Unknown
   StartTime=2021-10-04T14:59:52 EndTime=2021-10-04T15:29:52 Deadline=N/A
   SuspendTime=None SecsPreSuspend=0 LastSchedEval=2021-10-04T14:59:52 Scheduler=Main
   Partition=standard AllocNode:Sid=login05:1202865
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=a010
   BatchHost=a010
   NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
   TRES=cpu=1,mem=257526M,node=1,billing=1
   Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
   MinCPUsNode=1 MinMemoryNode=257526M MinTmpDiskNode=0
   Features=(null) DelayBoot=00:00:00
   OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=(null)
   WorkDir=/home/myusername/jobdir
   Power=
  • JobState lets you know if the job is Pending, Running, Completed, or Held.
  • RunTime and TimeLimit will show how long the job has run and its maximum time.
  • SubmitTime is when the job was submitted to the cluster.
  • The job's number of Nodes, Tasks, Cores (CPUs) and CPUs per Task are shown.
  • WorkDir is the job's working directory.
  • StdOut and Stderr are the locations of stdout and stderr of the job, respectively.
  • Reason will show why a PENDING job isn't running (for example, that it has been requested to start at a specific later time).

For historic (completed) jobs, you can use the jobinfo command. While not as detailed as scontrol output, it can also report information on jobs that are no longer active.

Checking Job Output

Once a job is submitted, and has started, it will write its standard output and standard error to files that you can read.

SLURM catches output written to standard output and standard error - what would be printed to your screen if you ran your program interactively. Unless you specified otherwise, SLURM will put the output in the directory where you submitted the job in a file named slurm- followed by the job id, with the extension out. For example slurm-3509.out. Note that both stdout and stderr will be written into the same file, unless you specify otherwise.

If your program writes its own output files, those files will be created as defined by the program. This may be in the directory where the program was run, or may be defined in a configuration or input file. You will need to check the documentation for your program for more details.

Redirecting Job Output

It is possible to redirect job output to somewhere other than the default location with the --error and --output directives:

#! /bin/sh -l
#SBATCH --output=/path/myjob.out
#SBATCH --error=/path/myjob.out

# This job prints "Hello World" to output and exits
echo "Hello World"

Holding a Job

Sometimes you may want to submit a job but not have it run just yet. For example, you may want to allow labmates to cut in front of you in the queue: hold the job until their jobs have started, and then release yours.

To place a hold on a job before it starts running, use the scontrol hold job command:

$ scontrol hold job  myjobid

Once a job has started running, it cannot be placed on hold.

To release a hold on a job, use the scontrol release job command:

$ scontrol release job  myjobid

Job Dependencies

Dependencies are an automated way of holding and releasing jobs. Jobs with a dependency are held until the condition is satisfied. Once the condition is satisfied, the jobs become eligible to run but must still queue as normal.

Job dependencies may be configured to ensure jobs start in a specified order. Jobs can be configured to run after other job state changes, such as when the job starts or the job ends.

These examples illustrate setting dependencies in several ways. Typically dependencies are set by capturing and using the job ID from the last job submitted.

To run a job after job myjobid has started:

$ sbatch --dependency=after:myjobid myjobsubmissionfile

To run a job after job myjobid ends without error:

$ sbatch --dependency=afterok:myjobid myjobsubmissionfile

To run a job after job myjobid ends with errors:

$ sbatch --dependency=afternotok:myjobid myjobsubmissionfile

To run a job after job myjobid ends with or without errors:

$ sbatch --dependency=afterany:myjobid myjobsubmissionfile

To set more complex dependencies on multiple jobs and conditions:

$ sbatch --dependency=after:myjobid1:myjobid2:myjobid3,afterok:myjobid4 myjobsubmissionfile
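
For example (a minimal sketch, assuming submission scripts named first_job.sh and second_job.sh), the job ID of the first submission can be captured with sbatch's --parsable option and used in the dependent submission:

$ first=$(sbatch --parsable first_job.sh)
$ sbatch --dependency=afterok:$first second_job.sh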

Canceling a Job

To stop a job before it finishes or remove it from a queue, use the scancel command:

$ scancel myjobid

Interactive Jobs

In addition to the ThinLinc and OnDemand interfaces, users can also choose to run interactive jobs on compute nodes, to obtain a shell that they can interact with. This gives users the ability to type commands or use a graphical interface as if they were on a login node.

To submit an interactive job, use sinteractive to run a login shell on allocated resources.

sinteractive accepts most of the same resource requests as sbatch, so to request a login shell in the compute queue while allocating 2 nodes and 256 total cores, you might do:

$ sinteractive -N2 -n256 -A oneofyourallocations

To quit your interactive job:

exit or Ctrl-D

Example Jobs

A number of example jobs are available for you to look over and adapt to your own needs. They demonstrate the basics of SLURM jobs and are designed to cover common job request scenarios.

Serial job in Standard queue

This is an example of a job submission file for a serial program:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#SBATCH -A myallocation # Allocation name (required if more than 1 available)
#SBATCH --nodes=1       # Total # of nodes (must be 1 for serial job)
#SBATCH --ntasks=1      # Total # of MPI tasks (should be 1 for serial job)
#SBATCH --time=1:30:00  # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname    # Job name
#SBATCH -o myjob.o%j    # Name of stdout output file
#SBATCH -e myjob.e%j    # Name of stderr error file
#SBATCH -p standard     # Queue (partition) name
#SBATCH --mail-user=useremailaddress
#SBATCH --mail-type=all # Send email to above address at begin and end of job

# Manage processing environment, load compilers and applications.
module purge
module load compilername
module load applicationname
module list

# Launch serial code
./mycode.exe

MPI job in Standard queue

An MPI job is a set of processes that take advantage of multiple compute nodes by communicating with each other. OpenMPI and Intel MPI (IMPI) are implementations of the MPI standard.

This is an example of a job submission file for an MPI program:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#SBATCH -A myallocation # Allocation name (required if more than 1 available)
#SBATCH --nodes=2       # Total # of nodes 
#SBATCH --ntasks=256    # Total # of MPI tasks
#SBATCH --time=1:30:00  # Total run time limit (hh:mm:ss)
#SBATCH -J myjobname    # Job name
#SBATCH -o myjob.o%j    # Name of stdout output file
#SBATCH -e myjob.e%j    # Name of stderr error file
#SBATCH -p standard     # Queue (partition) name
#SBATCH --mail-user=useremailaddress
#SBATCH --mail-type=all # Send email to above address at begin and end of job

# Manage processing environment, load compilers and applications.
module purge
module load compilername
module load mpilibrary
module load applicationname
module list

# Launch MPI code
srun -n $SLURM_NTASKS ./mycode.exe

SLURM can run an MPI program with the srun command. The number of processes is requested with the -n option. If you do not specify the -n option, it will default to the total number of processor cores you request from SLURM.

If the code is built with OpenMPI, it can be run with a simple srun -n command. If it is built with Intel IMPI, then you also need to add the --mpi=pmi2 option: srun --mpi=pmi2 -n 256 ./mycode.exe in this example.

  • Invoking an MPI program on Anvil with ./program is typically wrong, since this will use only one MPI process and defeat the purpose of using MPI. Unless that is what you want (rarely the case), you should use srun or mpiexec to invoke an MPI program.

    OpenMP job in Standard queue

    A shared-memory job is a single process that takes advantage of a multi-core processor and its shared memory to achieve parallelization.

    When running OpenMP programs, all threads must be on the same compute node to take advantage of shared memory. The threads cannot communicate between nodes.

    To run an OpenMP program, set the environment variable OMP_NUM_THREADS to the desired number of threads. This should almost always be equal to the number of cores on a compute node. You may want to set it to another appropriate value if you are running several processes in parallel within a single job or node.

    This example shows how to submit an OpenMP program:

    #!/bin/sh -l
    # FILENAME:  myjobsubmissionfile
    
    #SBATCH -A myallocation # Allocation name (required if more than 1 available)
    #SBATCH --nodes=1       # Total # of nodes (must be 1 for OpenMP job)
    #SBATCH --ntasks=128    # Total # of tasks (requests all 128 cores of one node)
    #SBATCH --time=1:30:00  # Total run time limit (hh:mm:ss)
    #SBATCH -J myjobname    # Job name
    #SBATCH -o myjob.o%j    # Name of stdout output file
    #SBATCH -e myjob.e%j    # Name of stderr error file
    #SBATCH -p standard     # Queue (partition) name
    #SBATCH --mail-user=useremailaddress
    #SBATCH --mail-type=all # Send email to above address at begin and end of job
    
    # Manage processing environment, load compilers and applications.
    module purge
    module load compilername
    module load applicationname
    module list
    
    # Set thread count (default value is 1).
    export OMP_NUM_THREADS=128
    
    # Launch OpenMP code
    ./mycode.exe
    

    If an OpenMP program uses a lot of memory and 128 threads use all of the memory of the compute node, use fewer processor cores (OpenMP threads) on that compute node.
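
    For example (a minimal sketch; the exact counts depend on your program's memory footprint), you could request half of a node and run half as many threads:

    #SBATCH --nodes=1
    #SBATCH --ntasks=64     # request 64 of the 128 cores

    export OMP_NUM_THREADS=64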

    Hybrid job in Standard queue

    A hybrid program combines both MPI and shared-memory to take advantage of compute clusters with multi-core compute nodes. Libraries for OpenMPI and Intel MPI (IMPI) and compilers which include OpenMP for C, C++, and Fortran are available.

    #!/bin/sh -l
    # FILENAME:  myjobsubmissionfile
    
    #SBATCH -A myallocation       # Allocation name (required if more than 1 available)
    #SBATCH --nodes=2             # Total # of nodes 
    #SBATCH --ntasks=2            # Total # of MPI tasks
    #SBATCH --cpus-per-task=128   # Number of CPUs per each MPI task
    #SBATCH --time=1:30:00        # Total run time limit (hh:mm:ss)
    #SBATCH -J myjobname          # Job name
    #SBATCH -o myjob.o%j          # Name of stdout output file
    #SBATCH -e myjob.e%j          # Name of stderr error file
    #SBATCH -p standard           # Queue (partition) name
    #SBATCH --mail-user=useremailaddress
    #SBATCH --mail-type=all       # Send email at begin and end of job
    
    # Manage processing environment, load compilers and applications.
    module purge
    module load compilername
    module load mpilibrary
    module load applicationname
    module list
    
    # Set thread count (default value is 1).
    export OMP_NUM_THREADS=128
    
    # Launch MPI code
    srun -n $SLURM_NTASKS ./mycode.exe 
    

    GPU job in GPU queue

    The Anvil cluster nodes contain GPUs that support CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Anvil, or use the sfeatures command to see the available node features.

    This section illustrates how to use SLURM to submit a simple GPU program.

    #!/bin/sh -l
    # FILENAME:  myjobsubmissionfile
    
    #SBATCH -A myallocation       # allocation name (required if more than 1 available)
    #SBATCH --nodes=1            # Total # of nodes 
    #SBATCH --ntasks-per-node=1   # Number of MPI ranks per node (one rank per GPU)
    #SBATCH --gres=gpu:1          # Number of GPUs per node
    #SBATCH --time=1:30:00        # Total run time limit (hh:mm:ss)
    #SBATCH -J myjobname          # Job name
    #SBATCH -o myjob.o%j          # Name of stdout output file
    #SBATCH -e myjob.e%j          # Name of stderr error file
    #SBATCH -p gpu                # Queue (partition) name
    #SBATCH --mail-user=useremailaddress
    #SBATCH --mail-type=all       # Send email to above address at begin and end of job
    
    # Manage processing environment, load compilers, and applications.
    module purge
    module load modtree/gpu
    module load applicationname
    module list
    
    # Launch GPU code
    ./mycode.exe 
    

    NGC GPU container job in GPU queue

    What is NGC?

    Nvidia GPU Cloud (NGC) is a GPU-accelerated cloud platform optimized for deep learning and scientific computing. NGC offers a comprehensive catalog of GPU-accelerated containers so that applications run quickly and reliably in high-performance computing environments. Purdue Research Computing deploys NGC to extend the cluster's capabilities, enable powerful software, and deliver faster results. By utilizing Singularity and NGC, users can focus on building lean models, producing optimal solutions, and gathering faster insights. For more information, please visit https://www.nvidia.com/en-us/gpu-cloud and NGC software catalog.

    Getting Started

    Users can download containers from the NGC software catalog and run them directly using Singularity instructions from the corresponding container’s catalog page.

    In addition, Research Computing provides a subset of pre-downloaded NGC containers wrapped into convenient software modules. These modules wrap underlying complexity and provide the same commands that are expected from non-containerized versions of each application.

    On Anvil, type the commands below to see the list of NGC containers we have deployed.

    $ module load ngc 
    $ module avail 
    

    Once the ngc module is loaded, you can run your code as you would with normal, non-containerized applications. This section illustrates how to use SLURM to submit a job with a containerized NGC program.

    #!/bin/sh -l
    # FILENAME:  myjobsubmissionfile
    
    #SBATCH -A myallocation       # allocation name (required if more than 1 available)
    #SBATCH --nodes=1             # Total # of nodes 
    #SBATCH --ntasks-per-node=1   # Number of MPI ranks per node (one rank per GPU)
    #SBATCH --gres=gpu:1          # Number of GPUs per node
    #SBATCH --time=1:30:00        # Total run time limit (hh:mm:ss)
    #SBATCH -J myjobname          # Job name
    #SBATCH -o myjob.o%j          # Name of stdout output file
    #SBATCH -e myjob.e%j          # Name of stderr error file
    #SBATCH -p gpu                # Queue (partition) name
    #SBATCH --mail-user=useremailaddress
    #SBATCH --mail-type=all       # Send email to above address at begin and end of job
    
    # Manage processing environment, load compilers and applications.
    module load ngc
    module load applicationname
    module list
    
    # Launch GPU code
    ./mycode.exe
    

    Managing and Transferring Files

    File Systems

    Anvil provides users with separate home, scratch, and project areas for managing files. These will be accessible via the $HOME, $SCRATCH, and $PROJECT environment variables. Each file system is available from all Anvil nodes but has different purge policies and ideal use cases (see the table below). Users in the same allocation will share access to the data in the $PROJECT space, which will be created upon request for each allocation.

    $SCRATCH is a high-performance, internally resilient GPFS parallel file system with 10 PB of usable capacity, configured to deliver up to 150 GB/s bandwidth.

    Anvil File Systems
    File System | Mount Point | Quota | Snapshots | Purpose | Purge policy
    Anvil ZFS | /home | 25 GB | Full schedule* | Home directories: area for storing personal software, scripts, compiling, editing, etc. | Not purged
    Anvil ZFS | /apps | N/A | Weekly* | Applications |
    Anvil GPFS | /anvil | N/A | No | |
    Anvil GPFS | /anvil/scratch | 100 TB | No | User scratch: area for job I/O activity, temporary storage | Files older than 30 days (access time) will be purged
    Anvil GPFS | /anvil/projects | 5 TB | Full schedule* | Per allocation: area for shared data in a project, common datasets and software installation | Not purged while allocation is active. Removed 90 days after allocation expiration
    Anvil GPFS | /anvil/datasets | N/A | Weekly* | Common data sets (not allocated to users) |
    Versity | N/A (Globus) | 20 TB | No | Tape storage per allocation |

    * Full schedule keeps nightly snapshots for 7 days, weekly snapshots for 3 weeks, and monthly snapshots for 2 months.

    Transferring Files

    Anvil supports several methods for file transfer to and from the system. Users can transfer files between Anvil and Linux-based systems or Mac using either scp or rsync. Windows SSH clients typically include scp-based file transfer capabilities.

    SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH protocol. SCP is available as a protocol choice in some graphical file transfer programs and also as a command line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, but will also recursively copy directory contents if given a directory name. SSH keys are required for SCP.

    Rsync, or Remote Sync, is a free and efficient command-line tool that lets you transfer files and directories to local and remote destinations. It copies only the changes from the source, and it can be customized for mirroring, performing backups, or migrating data between different filesystems. SSH keys are required for rsync.
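
    For example (a minimal sketch: the login hostname and destination path shown here are illustrative, so substitute your own username and directories), files can be copied from your local machine to Anvil with either tool:

    localhost$ scp -r mydata/ x-anvilusername@anvil.rcac.purdue.edu:/anvil/scratch/x-anvilusername/
    localhost$ rsync -av mydata/ x-anvilusername@anvil.rcac.purdue.edu:/anvil/scratch/x-anvilusername/mydata/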

    Globus is a powerful and easy to use file transfer and sharing service for transferring files virtually anywhere. It works between any XSEDE and non-XSEDE sites running Globus, and it connects any of these research systems to personal systems. You may use Globus to connect to your home, scratch, and project storage directories on Anvil. Since Globus is web-based, it works on any operating system that is connected to the internet. The Globus Personal client is available on Windows, Linux, and Mac OS X. It is primarily used as a graphical means of transfer but it can also be used over the command line. More details can be found at XSEDE Data Transfer & Management.

    Software

    Module System

    The Anvil cluster uses Lmod to manage the user environment, so users have access to the necessary software packages and versions to conduct their research activities. The associated module command can be used to load applications and compilers, making the corresponding libraries and environment variables automatically available in the user environment.

    Lmod is a hierarchical module system, meaning a module can only be loaded after loading the necessary compilers and MPI libraries that it depends on. This helps avoid conflicting libraries and dependencies being loaded at the same time. A list of all available modules on the system can be found with the module spider command:

    $ module spider # list all modules, including those not available due to incompatibility with currently loaded modules
    
    -----------------------------------------------------------------------------------
    The following is a list of the modules and extensions currently available:
    -----------------------------------------------------------------------------------
      amdblis: amdblis/3.0
      amdfftw: amdfftw/3.0
      amdlibflame: amdlibflame/3.0
      amdlibm: amdlibm/3.0
      amdscalapack: amdscalapack/3.0
      anaconda: anaconda/2021.05-py38
      aocc: aocc/3.0
    
    

    The module spider command can also be used to search for specific module names.

    $ module spider intel # all modules with names containing 'intel'
    -----------------------------------------------------------------------------------
      intel:
    -----------------------------------------------------------------------------------
         Versions:
            intel/19.0.5.281
            intel/19.1.3.304
         Other possible modules matches:
            intel-mkl
    -----------------------------------------------------------------------------------
    $ module spider intel/19.1.3.304 # additional details on a specific module
    -----------------------------------------------------------------------------------
      intel: intel/19.1.3.304
    -----------------------------------------------------------------------------------
    
        This module can be loaded directly: module load intel/19.1.3.304
    
        Help:
          Intel Parallel Studio.
    

    When users log into Anvil, a default compiler (GCC), MPI libraries (OpenMPI), and runtime environments (e.g., Cuda on GPU-nodes) are automatically loaded into the user environment. It is recommended that users explicitly specify which modules and which versions are needed to run their codes in their job scripts via the module load command. Users are advised not to insert module load commands in their bash profiles, as this can cause issues during initialization of certain software (e.g. Thinlinc).

    When users load a module, the module system will automatically replace or deactivate modules to ensure the packages you have loaded are compatible with each other. The following example shows the module system automatically replacing the default Intel compiler version with a user-specified version:

    $ module load intel # load default version of Intel compiler
    $ module list # see currently loaded modules
    
    Currently Loaded Modules:
      1) intel/19.0.5.281
    
    $ module load intel/19.1.3.304 # load a specific version of Intel compiler
    $ module list # see currently loaded modules
    
    The following have been reloaded with a version change:
      1) intel/19.0.5.281 => intel/19.1.3.304
    

    Most modules on Anvil include extensive help messages, so users can take advantage of the module help command to find information about a particular application or module. Every module also contains two environment variables named $RCAC_APPNAME_ROOT and $RCAC_APPNAME_VERSION identifying its installation prefix and its version. Users are encouraged to use generic environment variables such as CC, CXX, FC, MPICC, MPICXX etc. available through the compiler and MPI modules while compiling their code.
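
    For example (a minimal sketch; the module version and the exact variable name are illustrative), after loading a compiler module you can inspect its installation prefix and compile with the generic variables:

    $ module load gcc/10.2.0
    $ module help gcc/10.2.0          # help text for this module
    $ echo $RCAC_GCC_ROOT             # installation prefix of the loaded gcc module
    $ $CC -O3 mycode.c -o mycode      # CC points to the compiler provided by the loaded module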

    Some other common module commands:

    To unload a module

    $ module unload mymodulename

    To unload all loaded modules and reset everything to original state.

    $ module purge

    To see all available modules that are compatible with current loaded modules

    $ module avail

    To display information about a specified module, including environment changes, dependencies, software version and path.

    $ module show mymodulename

    Compiling, performance, and optimization on Anvil

    Anvil CPU nodes have GNU, Intel, and AOCC (AMD) compilers available along with multiple MPI implementations (OpenMPI and Intel MPI). Anvil GPU nodes will also provide the PGI compiler. Users may want to note the following AMD Milan specific optimization options that can help improve the performance of your code on Anvil:

    1. The majority of the applications on Anvil will be built using gcc/10.2.0 which features an AMD Milan specific optimization flag (-march=znver2).

    2. AMD Milan CPUs support the Advanced Vector Extensions 2 (AVX2) vector instructions set. GNU, Intel, and AOCC compilers all have flags to support AVX2. Using AVX2, up to eight floating point operations can be executed per cycle per core, potentially doubling the performance relative to non-AVX2 processors running at the same clock speed.

    3. In order to enable AVX2 support, when compiling your code, use the -march=znver2 flag (for GCC 10.2, Clang and AOCC compilers) or -march=core-avx2 (for Intel compilers and GCC prior to 9.3).
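
    For example (a minimal sketch using the flags above; myprogram.c stands in for your own source file):

    $ module load gcc/10.2.0
    $ gcc -O3 -march=znver2 myprogram.c -o myprogram        # GCC 10.2 / Clang / AOCC
    $ module load intel
    $ icc -O3 -march=core-avx2 myprogram.c -o myprogram     # Intel compilers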

    Other Software Usage Notes:

    1. Use the same environment that you compile the code to run your executables. When switching between compilers for different applications, make sure that you load the appropriate modules before running your executables.

    2. Explicitly set the optimization level in your makefiles or compilation scripts. Most well written codes can safely use the highest optimization level (-O3), but many compilers set lower default levels (e.g. GNU compilers use the default -O0, which turns off all optimizations).

    3. Turn off debugging, profiling, and bounds checking when building executables intended for production runs as these can seriously impact performance. These options are all disabled by default. The flag used for bounds checking is compiler dependent, but the debugging (-g) and profiling (-pg) flags tend to be the same for all major compilers.

    4. Some compiler options are the same for all available compilers on Anvil (e.g. "-o"), while others are different. Many options are available in one compiler suite but not the other. For example, Intel, PGI, and GNU compilers use the -qopenmp, -mp, and -fopenmp flags, respectively, for building OpenMP applications.

    5. MPI compiler wrappers (e.g. mpicc, mpif90) all call the appropriate compilers and load the correct MPI libraries depending on the loaded modules. While the same names may be used for different compilers, keep in mind that these are completely independent scripts.

    For Python users, Anvil provides two Python distributions: 1) a natively compiled Python module with a small subset of essential numerical libraries which are optimized for the AMD Milan architecture and 2) binaries distributed through Anaconda. Users are encouraged to use virtual environments for installing and managing additional Python packages.
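
    For example (a minimal sketch; the environment location is illustrative and any writable directory, such as your $HOME or $PROJECT space, works):

    $ module load anaconda
    $ python -m venv $PROJECT/envs/mypy        # create a virtual environment
    $ source $PROJECT/envs/mypy/bin/activate   # activate it
    $ pip install numpy                        # install additional packages into it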

    A broad range of application modules from various science and engineering domains will be installed on Anvil, including mathematics and statistical modeling tools, visualization software, computational fluid dynamics codes, molecular modeling packages, and debugging tools.

    In addition, Singularity will be supported on Anvil and Nvidia GPU Cloud containers are available on Anvil GPU nodes.

    Compiling Source code

    This section provides some examples of compiling source code on Anvil.

    Compiling Serial Programs

    A serial program is a single process which executes as a sequential stream of instructions on one processor core. Compilers capable of serial programming are available for C, C++, and versions of Fortran.

    Here are a few sample serial programs:

    • serial_hello.f
    • serial_hello.f90
    • serial_hello.f95
    • serial_hello.c
    • serial_hello.cpp
    To load a compiler, enter one of the following:

      $ module load intel
      $ module load gcc

    The following table illustrates how to compile your serial program:

    Language   | Intel Compiler                     | GNU Compiler                          | AOCC Compiler
    Fortran 77 | $ ifort myprogram.f -o myprogram   | $ gfortran myprogram.f -o myprogram   | $ flang program.f -o program
    Fortran 90 | $ ifort myprogram.f90 -o myprogram | $ gfortran myprogram.f90 -o myprogram | $ flang program.f90 -o program
    Fortran 95 | $ ifort myprogram.f90 -o myprogram | $ gfortran myprogram.f95 -o myprogram | $ flang program.f90 -o program
    C          | $ icc myprogram.c -o myprogram     | $ gcc myprogram.c -o myprogram        | $ clang program.c -o program
    C++        | $ icc myprogram.cpp -o myprogram   | $ g++ myprogram.cpp -o myprogram      | $ clang++ program.C -o program

      The Intel, GNU and AOCC compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95". You may use ".f90" to stand for any Fortran code regardless of version as it is a free-formatted form.

    Compiling MPI Programs

    OpenMPI and Intel MPI (IMPI) are implementations of the Message-Passing Interface (MPI) standard. Libraries for these MPI implementations and compilers for C, C++, and Fortran are available on Anvil.

    MPI programs require including a header file:
    Language Header Files
    Fortran 77
    INCLUDE 'mpif.h'
    Fortran 90
    INCLUDE 'mpif.h'
    Fortran 95
    INCLUDE 'mpif.h'
    C
    #include <mpi.h>
    C++
    #include <mpi.h>

    Here are a few sample programs using MPI:

    To see the available MPI libraries:

    $ module avail openmpi 
    $ module avail impi
    The following table illustrates how to compile your MPI program. Any compiler flags accepted by Intel ifort/icc compilers are compatible with their respective MPI compiler.
    Language   | Intel Compiler with Intel MPI (IMPI) | GNU Compiler with OpenMPI/Intel MPI (IMPI) | AOCC Compiler with OpenMPI/Intel MPI (IMPI)
    Fortran 77 | $ mpiifort program.f -o program      | $ mpif77 program.f -o program              | $ mpif77 program.f -o program
    Fortran 90 | $ mpiifort program.f90 -o program    | $ mpif90 program.f90 -o program            | $ mpif90 program.f90 -o program
    Fortran 95 | $ mpiifort program.f90 -o program    | $ mpif90 program.f90 -o program            | $ mpif90 program.f90 -o program
    C          | $ mpiicc program.c -o program        | $ mpicc program.c -o program               | $ mpiclang program.c -o program
    C++        | $ mpiicpc program.C -o program       | $ mpiCC program.C -o program               | $ mpiclang program.C -o program

    The Intel, GNU and AOCC compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95". You may use ".f90" to stand for any Fortran code regardless of version as it is a free-formatted form.

    Here is some more documentation from other sources on the MPI libraries:

    Compiling OpenMP Programs

    All compilers installed on Anvil include OpenMP functionality for C, C++, and Fortran. An OpenMP program is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. It distributes the work of a process over processor cores in a single compute node without the need for MPI communications.

    OpenMP programs require including a header file:
    Language Header Files
    Fortran 77
    INCLUDE 'omp_lib.h'
    Fortran 90
    use omp_lib
    Fortran 95
    use omp_lib
    C
    #include <omp.h>
    C++
    #include <omp.h>

    Sample programs illustrate task parallelism of OpenMP:

    A sample program illustrates loop-level (data) parallelism of OpenMP:

    To load a compiler, enter one of the following:

    $ module load intel
    $ module load gcc
    The following table illustrates how to compile your shared-memory program. Any compiler flags accepted by ifort/icc compilers are compatible with OpenMP.
    Language   | Intel Compiler                             | GNU Compiler                                   | AOCC Compiler
    Fortran 77 | $ ifort -openmp myprogram.f -o myprogram   | $ gfortran -fopenmp myprogram.f -o myprogram   | $ flang -fopenmp program.f -o program
    Fortran 90 | $ ifort -openmp myprogram.f90 -o myprogram | $ gfortran -fopenmp myprogram.f90 -o myprogram | $ flang -fopenmp program.f90 -o program
    Fortran 95 | $ ifort -openmp myprogram.f90 -o myprogram | $ gfortran -fopenmp myprogram.f90 -o myprogram | $ flang -fopenmp program.f90 -o program
    C          | $ icc -openmp myprogram.c -o myprogram     | $ gcc -fopenmp myprogram.c -o myprogram        | $ clang -fopenmp program.c -o program
    C++        | $ icc -openmp myprogram.cpp -o myprogram   | $ g++ -fopenmp myprogram.cpp -o myprogram      | $ clang++ -fopenmp program.cpp -o program

    The Intel, GNU and AOCC compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95". You may use ".f90" to stand for any Fortran code regardless of version as it is a free-formatted form.

    Here is some more documentation from other sources on OpenMP:

    Compiling Hybrid Programs

    A hybrid program combines both MPI and shared-memory to take advantage of compute clusters with multi-core compute nodes. Libraries for OpenMPI and Intel MPI (IMPI) and compilers which include OpenMP for C, C++, and Fortran are available.

    Hybrid programs require including header files:
    Language Header Files
    Fortran 77
    INCLUDE 'omp_lib.h'
    INCLUDE 'mpif.h'
    
    Fortran 90
    use omp_lib
    INCLUDE 'mpif.h'
    
    Fortran 95
    use omp_lib
    INCLUDE 'mpif.h'
    
    C
    #include <mpi.h>
    #include <omp.h>
    
    C++
    #include <mpi.h>
    #include <omp.h>
    

    A few examples illustrate hybrid programs with task parallelism of OpenMP:

    This example illustrates a hybrid program with loop-level (data) parallelism of OpenMP:

    To see the available MPI libraries:

    $ module avail impi
    $ module avail openmpi

    The following tables illustrate how to compile your hybrid (MPI/OpenMP) program. Any compiler flags accepted by Intel ifort/icc compilers are compatible with their respective MPI compiler.

    Intel Compiler with Intel MPI(IMPI)
    Language Command
    Fortran 77
    $ mpiifort -openmp myprogram.f -o myprogram
    Fortran 90
    $ mpiifort -openmp myprogram.f90 -o myprogram
    Fortran 95
    $ mpiifort -openmp myprogram.f90 -o myprogram
    C
    $ mpiicc -openmp myprogram.c -o myprogram
    C++
    $ mpiicpc -openmp myprogram.C -o myprogram
    Intel Compiler with OpenMPI/Intel MPI(IMPI)
    Language Command
    Fortran 77
    $ mpif77 -openmp myprogram.f -o myprogram
    Fortran 90
    $ mpif90 -openmp myprogram.f90 -o myprogram
    Fortran 95
    $ mpif90 -openmp myprogram.f90 -o myprogram
    C
    $ mpicc -openmp myprogram.c -o myprogram
    C++
    $ mpiCC -openmp myprogram.C -o myprogram
    GNU/AOCC Compiler with OpenMPI/Intel MPI(IMPI)
    Language Command
    Fortran 77
    $ mpif77 -fopenmp myprogram.f -o myprogram
    Fortran 90
    $ mpif90 -fopenmp myprogram.f90 -o myprogram
    Fortran 95
    $ mpif90 -fopenmp myprogram.f90 -o myprogram
    C
    $ mpicc -fopenmp myprogram.c -o myprogram
    C++
    $ mpiCC -fopenmp myprogram.C -o myprogram

    The Intel, GNU and AOCC compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95". You may use ".f90" to stand for any Fortran code regardless of version as it is a free-formatted form.

    Compiling NVIDIA GPU Programs

    The Anvil cluster contains GPU nodes that support CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Anvil. This section focuses on using CUDA.

    A simple CUDA program has a basic workflow:

    • Initialize an array on the host (CPU).
    • Copy array from host memory to GPU memory.
    • Apply an operation to array on GPU.
    • Copy array from GPU memory to host memory.

    Here is a sample CUDA program:

    ModuleTree or modtree helps users navigate between the CPU and GPU software stacks and sets up a default compiler and MPI environment. For the Anvil cluster, our team makes a recommendation regarding the CUDA version, compiler, and MPI library. This is a proven stable CUDA, compiler, and MPI library combination that is recommended if you have no specific requirements. To load the recommended set:

    $ module load modtree/gpu
    $ module list
    # you will have all following modules
    Currently Loaded Modules:
      1) gcc/8.4.1   2) numactl/2.0.14   3) zlib/1.2.11   4) openmpi/4.0.6   5) cuda/11.2.2   6) modtree/gpu
    

    Both login and GPU-enabled compute nodes have the CUDA tools and libraries available to compile CUDA programs. For complex compilations, submit an interactive job to get to the GPU-enabled compute nodes. The gpu-debug queue is ideal for this case. To compile a CUDA program, load modtree/gpu, and use nvcc to compile the program:

    $ module load modtree/gpu
    $ nvcc gpu_hello.cu -o gpu_hello
    $ ./gpu_hello
    No GPU specified, using first GPU
    hello, world
    

    The example illustrates only how to copy an array between a CPU and its GPU but does not perform a serious computation.

    The following program times three square matrix multiplications on a CPU and on the global and shared memory of a GPU:

    $ module load modtree/gpu
    $ nvcc mm.cu -o mm
    $ ./mm 0
                                                                speedup
                                                                -------
    Elapsed time in CPU:                    7810.1 milliseconds
    Elapsed time in GPU (global memory):      19.8 milliseconds  393.9
    Elapsed time in GPU (shared memory):       9.2 milliseconds  846.8
    

    For best performance, the input array or matrix must be sufficiently large to overcome the overhead in copying the input and output data to and from the GPU.

    For more information about NVIDIA, CUDA, and GPUs:

    Provided Software

    The Anvil team provides a suite of broadly useful software for users of research computing resources. This suite of software includes compilers, debuggers, visualization libraries, development environments, and other commonly used software libraries. Additionally, some widely-used application software is provided.

    ModuleTree or modtree helps users navigate between the CPU and GPU software stacks and sets up a default compiler and MPI environment. For the Anvil cluster, our team makes recommendations for both the CPU and GPU stacks regarding the CUDA version, compiler, math library, and MPI library. These are proven stable combinations that are recommended if you have no specific requirements. To load the recommended set:

    $ module load modtree/cpu # for CPU
    $ module load modtree/gpu # for GPU
    

    GCC Compiler

    The GNU Compiler (GCC) is provided via the module command on Anvil and will be maintained at a common version compatible across all clusters. Third-party software built with GCC will use this GCC version, rather than the GCC provided by the operating system vendor. To see the GCC compiler versions available from the module command:

    $ module avail gcc

    Toolchain

    The Anvil team will build and maintain an integrated, tested, and supported toolchain of compilers, MPI libraries, data format libraries, and other common libraries. This toolchain will consist of:

    • Compiler suite (C, C++, Fortran) (Intel and GCC)
    • BLAS and LAPACK
    • MPI libraries (OpenMPI, MVAPICH, Intel MPI)
    • FFTW
    • HDF5
    • NetCDF

    Each of these software packages will be combined with the stable "modtree/cpu" compiler, the latest available Intel compiler, and the common GCC compiler. The goal of these toolchains is to provide a range of compatible compiler and library suites that can be selected to build a wide variety of applications. At the same time, the number of compiler and library combinations is limited to keep the selection easy to navigate and understand. Generally, the toolchain built with the latest Intel compiler will be updated at major releases of the compiler.

    Commonly Used Applications

    The Anvil team will make every effort to provide a broadly useful set of popular software packages for research cluster users. Software packages such as Matlab, Python (Anaconda), NAMD, GROMACS, R, and others that are useful to a wide range of cluster users are provided via the module command.

    Changes to Provided Software

    Changes to available software, such as the introduction of new compilers and libraries or the retirement of older toolchains, will be scheduled in advance and coordinated with system maintenances. This is done to minimize impact and provide a predictable time for changes. Advance notice of changes will be given with regular maintenance announcements and through notices printed when modules are loaded. Be sure to check maintenance announcements and job output for any upcoming changes.

    Long Term Support

    The Anvil team understands the need for a stable and unchanging suite of compilers and libraries. Research projects are often tied to specific compiler versions throughout their lifetime. The Anvil team will make every effort to provide the "modtree/cpu" or "modtree/gpu" environment and the common GCC compiler as a long-term supported environment. These suites will stay unchanged for longer periods than the toolchain built with the latest available Intel compiler.

    Policies

    Here are details on some ITaP policies for research users and systems.

    Software Installation Request Policy

    The Anvil team will make every effort to provide a broadly useful set of popular software packages for research cluster users. However, many domain-specific packages that may only be of use to single users or small groups of users are beyond the capacity of research computing staff to fully maintain and support. Please consider the following if you require software that is not available via the module command:

    • If your lab is the only user of a software package, Anvil staff may recommend that you install your software privately, either in your home directory or in your allocation project space. If you need help installing software, the Anvil support team may be able to provide limited help.
    • As more users request a particular piece of software, Anvil may decide to provide the software centrally. Matlab, Python (Anaconda), NAMD, GROMACS, and R are all examples of frequently requested and used centrally-installed software.
    • Python modules that are available through the Anaconda distribution will be installed through it. Anvil staff may recommend you install other Python modules privately.

    If you're not sure how your software request should be handled or need help installing software, please contact us at the Help Desk.

    Helpful Tips

    We will strive to ensure that Anvil serves as a valuable resource to the national research community. We hope that you, the user, will assist us by taking note of the following:

    • You share Anvil with thousands of other users, and what you do on the system affects others. Exercise good citizenship to ensure that your activity does not adversely impact the system and the research community with whom you share it. For instance: do not run jobs on the login nodes and do not stress the filesystem.
    • Help us serve you better by filing informative help desk tickets. Before submitting a help desk ticket, check what the user guide and other documentation say. Search the internet for key phrases in your error logs; that's probably what the consultants answering your ticket are going to do. What have you changed since the last time your job succeeded?
    • Describe your issue as precisely and completely as you can: what you did, what happened, verbatim error messages, other meaningful output. When appropriate, include the information a consultant would need to find your artifacts and understand your workflow: e.g. the directory containing your build and/or job script; the modules you were using; relevant job numbers; and recent changes in your workflow that could affect or explain the behavior you're observing.
    • Have realistic expectations. Consultants can address system issues and answer questions about Anvil. But they can't teach parallel programming in a ticket and may know nothing about the package you downloaded. They may offer general advice that will help you build, debug, optimize, or modify your code, but you shouldn't expect them to do these things for you.
    • Be patient. It may take a business day for a consultant to get back to you, especially if your issue is complex. It might take an exchange or two before you and the consultant are on the same page. If the admins disable your account, it's not punitive. When the file system is in danger of crashing, or a login node hangs, they don't have time to notify you before taking action.
      Helpful Tools

      The Anvil cluster provides a number of useful auxiliary tools:

      Tool           | Use
      myquota        | Check the quota of different file systems.
      flost          | A utility to recover files from snapshots.
      showpartitions | Display all Slurm partitions and their current usage.
      myscratch      | Show the path to your scratch directory.
      jobinfo        | Collates job information from the sstat, sacct, and squeue SLURM commands to give a uniform interface for both current and historical jobs.
      sfeatures      | Show the list of available constraint feature names for different node types.