Radon - Getting Started

Conventions Used in this Document

This document follows certain typesetting and naming conventions:

  • Colored, underlined text indicates a link.
  • Colored, bold text highlights something of particular importance.
  • Italicized text notes the first use of a key concept or term.
  • Bold, fixed-width font text indicates a command or command argument that you type verbatim.
  • Examples of commands and output as you would see them on the command line will appear in colored blocks of fixed-width text such as this:
    $ example
    This is an example of commands and output.
    
  • All command line shell prompts appear as a single dollar sign ("$"). Your actual shell prompt may differ.
  • All examples work with bash or ksh shells. Where different, changes needed for tcsh or csh shell users appear in example comments.
  • All names that begin with "my" illustrate examples that you replace with an appropriate name. These include "myusername", "myfilename", "mydirectory", "myjobid", etc.
  • The term "processor core" or "core" throughout this guide refers to the individual CPU cores on a processor chip.

Overview of Radon

Radon is a compute cluster operated by ITaP for general campus use. Radon consists of 24 64-bit, 8-core Dell 1950 systems with 16 GB RAM and 1 Gigabit Ethernet (1GigE) local to each node.

Detailed Hardware Specification

Radon consists of one logical sub-cluster, "D". Each node has two 2.33 GHz quad-core Intel E5410 processors, 16 GB RAM, and 1 Gigabit Ethernet.

Sub-Cluster   Number of Nodes   Processors per Node                  Cores per Node   Memory per Node   Interconnect   Theoretical Peak TeraFLOPS
Radon-D       30                Two 2.33 GHz Quad-Core Intel E5410   8                16 GB             1 GigE         58.2

Radon nodes run Red Hat Enterprise Linux 5 (RHEL5) and use Moab Workload Manager 7 and TORQUE Resource Manager 4 as the portable batch system (PBS) for resource and job management. Radon also runs jobs for BoilerGrid whenever processor cores in it would otherwise be idle. The application of operating system patches occurs as security needs dictate. All nodes allow for unlimited stack usage, as well as unlimited core dump size (though disk space and server quotas may still be a limiting factor).
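
To check these limits in your current shell (a quick sanity check; the "unlimited" output shown assumes the settings described above):

$ ulimit -s
unlimited
$ ulimit -c
unlimited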

For more information about the TORQUE Resource Manager, refer to the Adaptive Computing TORQUE documentation.

On Radon, ITaP recommends the following set of compiler, math library, and message-passing library for parallel code:

  • Intel 13.1.1.163
  • MKL
  • OpenMPI 1.6.3

To load the recommended set:

$ module load devel

To verify what you loaded:

$ module list

Accounts on Radon

Obtaining an Account

All Purdue faculty and staff, and students with the approval of their advisor, may request access to Radon. Refer to the Accounts / Access page for more details on how to request access.

Login / SSH

To submit jobs on Radon, log in to the submission host radon.rcac.purdue.edu via SSH. This submission host is actually two front-end hosts: radon-fe00 and radon-fe01. The login process randomly assigns one of these front ends to each login to radon.rcac.purdue.edu. While the front-end hosts are identical, each has its own /tmp, so data placed in /tmp during one session may not be available in a later session. ITaP advises using scratch storage instead for data shared across multiple sessions.

SSH Client Software

Secure Shell or SSH is a way of establishing a secure (encrypted) connection between two computers. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. Its usual function involves logging in to a remote machine and executing commands, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. There are many SSH clients available for all operating systems.

Linux / Solaris / AIX / HP-UX / Unix:

  • The ssh command is pre-installed. Log in using ssh myusername@servername.

Microsoft Windows:

  • PuTTY is an extremely small download of a free, full-featured SSH client.
  • SecureCRT is a commercial SSH client which is freely available to Purdue students, faculty, and staff with a Purdue career account.

Mac OS X:

  • The ssh command is pre-installed. You may start a local terminal window from "Applications->Utilities". Log in using ssh myusername@servername.
  • MacSSH is another free SSH client.

SSH Keys

SSH works with many different means of authentication. One popular authentication method is Public Key Authentication (PKA). PKA is a method of establishing your identity to a remote computer using related sets of encryption data called keys. PKA is a more secure alternative to traditional password-based authentication with which you are probably familiar.

To employ PKA via SSH, you manually generate a keypair (also called SSH keys) in the location from where you wish to initiate a connection to a remote machine. This keypair consists of two text files: private key and public key. You keep the private key file confidential on your local machine or local home directory (hence the name "private" key). You then log in to a remote machine (if possible) and append the corresponding public key text to the end of a specific file, or have a system administrator do so on your behalf. In future login attempts, PKA compares the public and private keys to verify your identity; only then do you have access to the remote machine.
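
As a minimal illustration from a Linux or Mac OS X local machine (the key filename id_rsa is the ssh-keygen default; "myusername" and "servername" are placeholders as described in the conventions above):

  (generate a keypair; you will be prompted for a passphrase)
$ ssh-keygen -t rsa

  (append your public key to the authorized keys on the remote machine)
$ ssh-copy-id myusername@servername

  (or, if ssh-copy-id is unavailable, append it manually)
$ cat ~/.ssh/id_rsa.pub | ssh myusername@servername 'cat >> ~/.ssh/authorized_keys'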

As a user, you can create, maintain, and employ as many keypairs as you wish. If you connect to a computational resource from your work laptop, your work desktop, and your home desktop, you can create and employ keypairs on each. You can also create multiple keypairs on a single local machine to serve different purposes, such as establishing access to different remote machines or establishing different types of access to a single remote machine. In short, PKA via SSH offers a secure but flexible means of identifying yourself to all kinds of computational resources.

Passphrases and SSH Keys

Creating a keypair prompts you to provide a passphrase for the private key. This passphrase differs from a password in a number of ways. First, a passphrase is, as the name implies, a phrase: it can include most types of characters, including spaces, and has no limit on length. Second, the remote machine never receives this passphrase for verification. Its only purpose is to unlock a particular local private key.
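
If you later wish to change or add the passphrase on an existing private key, you can do so without generating a new keypair; for example, assuming the default key file ~/.ssh/id_rsa:

$ ssh-keygen -p -f ~/.ssh/id_rsa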

Perhaps you are wondering why you would need a private key passphrase at all when using PKA. If the private key remains secure, why the need for a passphrase just to use it? Indeed, if the location of your private keys were always completely secure, a passphrase might not be necessary. In reality, a number of situations could arise in which someone may improperly gain access to your private key files. In these situations, a passphrase offers another level of security for you, the user who created the keypair.

Think of the private key/passphrase combination as being analogous to your ATM card/PIN combination. The ATM card itself is the object that grants access to your important accounts, and as such, should remain secure at all times—just as a private key should. But if you ever lose your wallet or someone steals your ATM card, you are glad that your PIN exists to offer another level of protection. The same is true for a private key passphrase.

When you create a keypair, you should always provide a corresponding private key passphrase. For security purposes, avoid using phrases which automated programs can discover (e.g. phrases that consist solely of words in English-language dictionaries). This passphrase is not recoverable if forgotten, so make note of it. Only a few situations warrant using a non-passphrase-protected private key—conducting automated file backups is one such situation. If you need to use a non-passphrase-protected private key to conduct automated backups to Fortress, see the No-Passphrase SSH Keys section.

Passwords

If you have received a default password as part of the process of obtaining your account, you should change it before you log onto Radon for the first time. Change your password from the SecurePurdue website. You will have the same password on all ITaP systems such as Radon, Purdue email, or Blackboard.

Passwords may need to be changed periodically in accordance with Purdue security policies. Passwords must follow certain guidelines, as described on the SecurePurdue webpage, and ITaP recommends choosing a strong password.

ITaP staff will NEVER ask for your password, by email or otherwise.

Never share your password with another user or make your password known to anyone else.

File Storage and Transfer for Radon

Storage Options

File storage options on ITaP research systems include long-term storage (home directories, Fortress) and short-term storage (scratch directories, /tmp directory). Each option has different performance and intended uses, and some options vary from system to system as well. ITaP provides daily snapshots of home directories for a limited time for accidental deletion recovery. ITaP does not back up scratch directories or temporary storage and regularly purges old files from scratch and /tmp directories. More details about each storage option appear below.

Home Directories

ITaP provides home directories for long-term file storage. Each user ID has one home directory. You should use your home directory for storing important program files, scripts, input data sets, critical results, and frequently used files. You should store infrequently used files on Fortress. Your home directory becomes your current working directory, by default, when you log in.

ITaP provides daily snapshots of your home directory for a limited period of time in the event of accidental deletion. For additional security, you should store another copy of your files on more permanent storage, such as the Fortress HPSS Archive.

Your home directory physically resides within the Isilon storage system at Purdue. To find the path to your home directory, first log in then immediately enter the following:

$ pwd
/home/myusername

Or from any subdirectory:

$ echo $HOME
/home/myusername

Your home directory and its contents are available on all ITaP research front-end hosts and compute nodes via the Network File System (NFS).

Your home directory has a quota capping the size and/or number of files you may store within. For more information, refer to the Storage Quotas / Limits Section.

Lost Home Directory File Recovery

Only files which have been snap-shotted overnight are recoverable. If you lose a file the same day you created it, it is NOT recoverable.

To recover files lost from your home directory, use the flost command:

$ flost

Scratch Directories

ITaP provides scratch directories for short-term file storage only. The quota of your scratch directory is much greater than the quota of your home directory. You should use your scratch directory for storing large temporary input files which your job reads or for writing large temporary output files which you may examine after execution of your job. You should use your home directory and Fortress for longer-term storage or for holding critical results.

Files in scratch directories are not recoverable. ITaP does not back up files in scratch directories. If you accidentally delete a file, a disk crashes, or old files are purged, those files cannot be restored.

ITaP automatically removes (purges) from scratch directories all files stored for more than 90 days. Owners of these files receive a notice one week before removal via email. For more information, please refer to our Scratch File Purging Policy.
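
To get a rough idea of which of your scratch files are nearing the purge age, you can list files older than a given number of days with the standard find command (this example uses modification time, which may differ from the criterion the purge policy applies, so treat it only as an estimate):

$ find $RCAC_SCRATCH -type f -mtime +60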

All users may access scratch directories on Radon. To find the path to your scratch directory:

$ findscratch
/scratch/scratch95/m/myusername

The environment variable $RCAC_SCRATCH contains your scratch directory path. Use this variable in any scripts. Your actual scratch directory path may change without warning, but this variable will remain current.

$ echo $RCAC_SCRATCH
/scratch/scratch95/m/myusername

All scratch directories are available on each front end of all computational resources; however, only the /scratch/scratch95 directory is available on Radon compute nodes. No other scratch directories are available on Radon compute nodes.

To find the path to someone else's scratch directory:

$ findscratch someusername
/scratch/scratch95/s/someusername

Your scratch directory has a quota capping the total size and number of files you may store in it. For more information, refer to the section Storage Quotas / Limits.

/tmp Directory

ITaP provides /tmp directories for short-term file storage only. Each front-end and compute node has a /tmp directory. Your program may write temporary data to the /tmp directory of the compute node on which it is running. That data is available for as long as your program is active. Once your program terminates, that temporary data is no longer available. When used properly, /tmp may provide faster local storage to an active process than any other storage option. You should use your home directory and Fortress for longer-term storage or for holding critical results.
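
A minimal sketch of this pattern in a job submission file (the program and file names are placeholders, not an ITaP-provided example): write intermediate data to /tmp on the compute node, then copy anything worth keeping to scratch before the job ends:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

cd $PBS_O_WORKDIR

# Work in node-local /tmp for fast temporary I/O.
mkdir -p /tmp/myusername_$PBS_JOBID
./myprogram -o /tmp/myusername_$PBS_JOBID/output.dat

# Copy results you need to keep before the job terminates.
cp /tmp/myusername_$PBS_JOBID/output.dat $RCAC_SCRATCH/myproject/

# Clean up.
rm -rf /tmp/myusername_$PBS_JOBID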

ITaP does not perform backups for the /tmp directory and removes files from /tmp whenever space is low or whenever the system needs a reboot. In the event of a disk crash or file purge, files in /tmp are not recoverable. You should copy any important files to more permanent storage.

Long-Term Storage

Long-term Storage or Permanent Storage is available to ITaP research users on the High Performance Storage System (HPSS), an archival storage system, commonly referred to as "Fortress". HPSS is a software package that manages a hierarchical storage system. Program files, data files and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity.

Files smaller than 100 MB have their primary copy stored on low-cost disks (disk cache), with a second copy (a backup of the disk cache) on tape or optical disks. This provides a rapid restore time from the disk cache. However, the large latency involved in accessing a larger file (which usually requires a copy from a tape cartridge) makes Fortress unsuitable for direct use by processes or jobs, even where that is possible. The primary and secondary copies of larger files are stored on separate tape cartridges in the Quantum (formerly ADIC, Advanced Digital Information Corporation) tape library.

To ensure optimal performance for all users, and to keep the Fortress system healthy, please remember the following tips:

  • Fortress operates most effectively with large files - 1GB or larger. If your data consists of smaller files, use HTAR to create archives directly in Fortress (see the examples after this list).
  • When working with files on cluster head nodes, use your home directory or a scratch file system, rather than editing or computing on files directly in Fortress. Copy any data you wish to archive to Fortress after computation is complete.
  • The HPSS software does not handle sparse files (files with empty space) in an optimal manner. Therefore, if you must copy a sparse file into HPSS, use HSI rather than the cp or mv commands.
  • Due to the sparse files issue, the rsync command should not be used to copy data into Fortress through NFS, as this may cause problems with the system.
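
As an illustration of these tips, the following commands bundle a directory of small files into a single archive stored directly in Fortress with HTAR, and copy a single large file into Fortress with HSI (the directory and file names are placeholders):

  (create archive mydirectory.tar in Fortress from the local directory mydirectory)
$ htar -cvf mydirectory.tar mydirectory/

  (copy one large file into your Fortress home directory)
$ hsi put mylargefile.dat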

Fortress writes two copies of every file either to two tapes, or to disk and a tape, to protect against medium errors. Unfortunately, Fortress does not automatically switch to the alternate copy when it has trouble accessing the primary. If it seems to be taking an extraordinary amount of time to retrieve a file (hours), please either email rcac-help@purdue.edu or call ITaP Customer Service at 765-49-4400. We can then investigate why it is taking so long. If it is an error on the primary copy, we will instruct Fortress to switch to the alternate copy as the primary and recreate a new alternate copy.

For more information about Fortress, how it works, user guides, and how to obtain an account, refer to the Fortress documentation.

Environment Variables

Several environment variables are automatically defined for you to help you manage your storage. Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:

Name Description
HOME path to your home directory
PWD path to your current directory
RCAC_SCRATCH path to scratch filesystem

By convention, environment variable names are all uppercase. You may use them on the command line or in any scripts in place of and in combination with hard-coded values:

$ ls $HOME
...

$ ls $RCAC_SCRATCH/myproject
...

To find the value of any environment variable:

$ echo $RCAC_SCRATCH
/scratch/scratch95/m/myusername

To list the values of all environment variables:

$ env
USER=myusername
HOME=/home/myusername
RCAC_SCRATCH=/scratch/scratch95/m/myusername
...

You may create or overwrite an environment variable. To pass (export) the value of a variable in either bash or ksh:

$ export MYPROJECT=$RCAC_SCRATCH/myproject

To assign a value to an environment variable in either tcsh or csh:

$ setenv MYPROJECT $RCAC_SCRATCH/myproject

Storage Quotas / Limits

ITaP imposes some limits on your disk usage on research systems. ITaP implements a quota on each filesystem. Each filesystem (home directory, scratch directory, etc.) may have a different limit. If you exceed the quota, you will not be able to save new files or new data to the filesystem until you delete or move data to long-term storage.

Checking Quota Usage

To check the current quotas of your home and scratch directories use the myquota command:

$ myquota
Type        Filesystem             Size     Limit   Use        Files     Limit   Use
====================================================================================
home        extensible            5.0GB    10.0GB   50%            -         -     -
scratch     /scratch/scratch95/     8KB   476.8GB    0%            2   100,000    0%

The columns are as follows:

  1. Type: indicates home or scratch directory.
  2. Filesystem: name of storage option.
  3. Size: sum of file sizes in bytes.
  4. Limit: allowed maximum on sum of file sizes in bytes.
  5. Use: percentage of file-size limit currently in use.
  6. Files: number of files and directories (not the size).
  7. Limit: allowed maximum on number of files and directories. It is possible, though unlikely, to reach this limit and not the file-size limit if you create a large number of very small files (see the example after this list).
  8. Use: percentage of file-number limit currently in use.
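
If you suspect you are approaching the file-count limit rather than the size limit, you can count the files and directories under a directory with standard tools (the count includes directories and the top-level directory itself):

$ find $RCAC_SCRATCH | wc -l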

If you find that you reached your quota in either your home directory or your scratch file directory, obtain estimates of your disk usage. Find the top-level directories which have a high disk usage, then study the subdirectories to discover where the heaviest usage lies.

To see in a human-readable format an estimate of the disk usage of your top-level directories in your home directory:

$ du -h --max-depth=1 $HOME
32K     /home/myusername/mysubdirectory_1
529M    /home/myusername/mysubdirectory_2
608K    /home/myusername/mysubdirectory_3

The second directory is the largest of the three, so apply the du command to it to see which of its subdirectories use the most space.

To see in a human-readable format an estimate of the disk usage of your top-level directories in your scratch file directory:

$ du -h --max-depth=1 $RCAC_SCRATCH
160K    /scratch/scratch95/m/myusername

This strategy can be very helpful in figuring out the location of your largest usage. Move unneeded files and directories to long-term storage to free space in your home and scratch directories.

Increasing Your Storage Quota

Home Directory

If you find you need additional disk space in your home directory, please first consider archiving and compressing old files and moving them to long-term storage on the Fortress HPSS Archive. If you are unable to do so, you may go to the BoilerBackpack Quota Management site and use the sliders there to increase the amount of space allocated to your research home directory vs. other storage options, up to a maximum of 100GB.

Scratch Directory

If you find you need additional disk space in your scratch directory, please first consider archiving and compressing old files and moving them to long-term storage on the Fortress HPSS Archive. If you are unable to do so, you may ask for a quota increase at rcac-help@purdue.edu. Quota requests up to 2TB and 200,000 files on LustreA or LustreC can be processed quickly.

Archive and Compression

There are several options for archiving and compressing groups of files or directories on ITaP research systems. The most commonly used options are:

  • tar   (more information)
    Saves many files together into a single archive file, and restores individual files from the archive. Includes automatic archive compression/decompression options and special features for incremental and full backups.
    Examples:
      (list contents of archive somefile.tar)
    $ tar tvf somefile.tar
    
      (extract contents of somefile.tar)
    $ tar xvf somefile.tar
    
      (extract contents of gzipped archive somefile.tar.gz)
    $ tar xzvf somefile.tar.gz
    
      (extract contents of bzip2 archive somefile.tar.bz2)
    $ tar xjvf somefile.tar.bz2
    
      (archive all ".c" files in current directory into one archive file)
    $ tar cvf somefile.tar *.c 
    
      (archive and gzip-compress all files in a directory into one archive file)
    $ tar czvf somefile.tar.gz somedirectory/
    
      (archive and bzip2-compress all files in a directory into one archive file)
    $ tar cjvf somefile.tar.bz2 somedirectory/
    
    
    Other arguments for tar can be explored by using the man tar command.
  • gzip   (more information)
    The standard compression system for all GNU software.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ gzip somefile
    
      (uncompress file somefile.gz - also removes compressed file)
    $ gunzip somefile.gz
    
  • bzip2   (more information)
    Strong, lossless data compressor based on the Burrows-Wheeler transform. Stronger compression than gzip.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ bzip2 somefile
    
      (uncompress file somefile.bz2 - also removes compressed file)
    $ bunzip2 somefile.bz2
    

There are several other, less commonly used, options available as well:

  • zip
  • 7zip
  • xz

File Transfer

There are a variety of ways to transfer data to and from ITaP research systems. Which you should use depends on several factors, including personal ease of use, connection speed and bandwidth, and the size and number of files to transfer. For more details on file transfer methods and applications, refer to the Radon Complete User Guide.
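
As one common example, you can copy files to and from Radon with the standard scp command from a Linux or Mac OS X terminal (the filenames and the scratch path shown are placeholders following the conventions above):

  (copy a local file to your Radon home directory)
$ scp myfilename myusername@radon.rcac.purdue.edu:

  (copy a file from your Radon scratch directory to the current local directory)
$ scp myusername@radon.rcac.purdue.edu:/scratch/scratch95/m/myusername/myfilename .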

Applications on Radon

Provided Applications

The following table lists the third-party software which ITaP has installed on its research systems. Additional software may be available. To see the software on a specific system, run the command module avail on that system. Please contact rcac-help@purdue.edu if you are interested in the availability of software not shown in this list.

Software
Abaqus ¹
AcGrace
Amber ¹
Ann
ANSYS ¹
ATK
Antelope
Auto3Dem
ATLAS
BinUtils
BLAST
Boost
Cairo
CDAT
CGNSLib
Cmake
COMSOL ²
CPLEX ¹
DX
Eman
Eman2
Ferret
FFMPEG
FFTW
FLUENT ¹
GAMESS
GAMS
Gaussian ¹
GCC (Compilers)
GDAL
GemPak
Git
GLib
GMP
GMT
GrADS
GROMACS
GS
GSL
GTK+
GTKGlarea
Guile
HarminV
HDF4
HDF5
Hy3S
ImageMagick
IMSL ¹
Intel Compilers ¹
Jackal ²
Jasper
Java
LAMMPS
LibCTL
LibPNG
LibTool
LoopyMod ²
Maple ¹
Mathematica ¹
MATLAB ¹
Meep
MoPac
MPB
MPFR
MPICH
MPICH2
MPIExec
MrBayes
MUMPS
MVAPICH2
NAMD
NCL
NCO
NCView
NetCDF
NETPBM
NWChem
Octave
OpenMPI
Pango
Petsc
PGI Compilers ¹
Phrap
Pixman
PKG-Config
Proj
Python
QTLC
Rational
R
SAC
SAS ¹
ScaLAPACK
Seismic
Subversion
SWFTools
Swig
SysTools
Tao
TecPlot ²
TotalView ¹
UDUNITS
Valgrind
VMD
Weka

¹ Only users on Purdue's West Lafayette campus may use this software.
² Only specific research groups may use this software.

Please contact rcac-help@purdue.edu for specific questions about software license restrictions on ITaP research systems.

Environment Management with the Module Command

ITaP uses the module command as the preferred method to manage your processing environment. With this command, you may load applications and compilers along with their libraries and paths. Modules are packages which you load and unload as needed.

Please use the module command and do not manually configure your environment, as ITaP staff will frequently make changes to the specifics of various packages. If you use the module command to manage your environment, these changes will not be noticeable.

To view a brief usage report:

$ module

Below follows a short introduction to the module command. You can see more in the man page for module.

List Available Modules

To see what modules are available on this system:

$ module avail

To see which versions of a specific compiler are available on this system:

$ module avail gcc
$ module avail intel
$ module avail pgi

To see available modules for MPI libraries:

 $ module avail openmpi 
 $ module avail impi 
 $ module avail mpich2 

To see available modules for specific provided applications, use names from the list obtained with the command module avail:

$ module avail abaqus
$ module avail matlab
$ module avail mathematica

Load / Unload a Module

All modules consist of both a name and a version number. When loading a module, you may use only the name to load the default version, or you may specify which version you wish to load.

For each cluster, ITaP makes a recommendation regarding the set of compiler, math library, and message-passing library for parallel code. To load the recommended set:

$ module load devel

To verify what you loaded:

$ module list

To load the default version of a specific compiler, choose one of the following commands:

$ module load gcc
$ module load intel
$ module load pgi

To load a specific version of the Intel compiler, include the version number:

$ module load intel/11.1.072

When running a job, you must load any relevant modules from within the job submission file so that they are loaded on the compute node(s). Loading modules on the front end before submitting your job is sufficient while you use the front end during the development phase of your application, but it is not sufficient for the compute node(s) during the production phase; you must load the same modules on the compute node(s) via the job submission file.

To unload a module, enter the same module name used to load that module. Unloading will attempt to undo the environmental changes which a previous load command installed.

To unload the default version of a specific compiler:

$ module unload gcc
$ module unload intel
$ module unload pgi

To unload a specific version of the Intel compiler, include the same version number used to load that Intel compiler:

$ module unload intel/11.1.072

Apply the same methods to manage the modules of provided applications:

$ module load matlab
$ module unload matlab

To unload all currently loaded modules:

$ module purge

List Currently Loaded Modules

To see currently loaded modules:

$ module list
Currently Loaded Modulefiles:
  1) intel/12.1

To unload a module:

$ module unload intel
$ module list
No Modulefiles Currently Loaded.

Compiling Source Code on Radon

Provided Compilers

Compilers are available on Radon for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution. More detailed documentation on each compiler set available on Radon follows.

On Radon, ITaP recommends the following set of compiler, math library, and message-passing library for parallel code:

  • Intel 13.1.1.163
  • MKL
  • OpenMPI 1.6.3

To load the recommended set:

$ module load devel
$ module list

Intel Compiler Set

One or more versions of the Intel compiler set (compilers and associated libraries) are available on Radon. To discover which ones:

$ module avail intel

Choose an appropriate Intel module and load it. For example:

$ module load intel

Here are some examples for the Intel compilers:

Fortran 77
  Serial:  $ ifort myprogram.f -o myprogram
  MPI:     $ mpiifort myprogram.f -o myprogram
  OpenMP:  $ ifort -openmp myprogram.f -o myprogram

Fortran 90
  Serial:  $ ifort myprogram.f90 -o myprogram
  MPI:     $ mpiifort myprogram.f90 -o myprogram
  OpenMP:  $ ifort -openmp myprogram.f90 -o myprogram

Fortran 95
  (same as Fortran 90, using the .f90 commands above)

C
  Serial:  $ icc myprogram.c -o myprogram
  MPI:     $ mpiicc myprogram.c -o myprogram
  OpenMP:  $ icc -openmp myprogram.c -o myprogram

C++
  Serial:  $ icpc myprogram.cpp -o myprogram
  MPI:     $ mpiicpc myprogram.cpp -o myprogram
  OpenMP:  $ icpc -openmp myprogram.cpp -o myprogram

More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module. For further documentation, refer to Intel's compiler documentation.

GNU Compiler Set

The official name of the GNU compilers is "GNU Compiler Collection" or "GCC". One or more versions of the GNU compiler set (compilers and associated libraries) are available on Radon. To discover which ones:

$ module avail gcc

Choose an appropriate GCC module and load it. For example:

$ module load gcc

An older version of the GNU compiler will be in your path by default. Do NOT use this version. Instead, load a newer version using the command module load gcc.

Here are some examples for the GNU compilers:

Fortran 77
  Serial:  $ gfortran myprogram.f -o myprogram
  MPI:     $ mpif77 myprogram.f -o myprogram
  OpenMP:  $ gfortran -fopenmp myprogram.f -o myprogram

Fortran 90
  Serial:  $ gfortran myprogram.f90 -o myprogram
  MPI:     $ mpif90 myprogram.f90 -o myprogram
  OpenMP:  $ gfortran -fopenmp myprogram.f90 -o myprogram

Fortran 95
  Serial:  $ gfortran myprogram.f95 -o myprogram
  MPI:     $ mpif90 myprogram.f95 -o myprogram
  OpenMP:  $ gfortran -fopenmp myprogram.f95 -o myprogram

C
  Serial:  $ gcc myprogram.c -o myprogram
  MPI:     $ mpicc myprogram.c -o myprogram
  OpenMP:  $ gcc -fopenmp myprogram.c -o myprogram

C++
  Serial:  $ g++ myprogram.cpp -o myprogram
  MPI:     $ mpiCC myprogram.cpp -o myprogram
  OpenMP:  $ g++ -fopenmp myprogram.cpp -o myprogram

More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module. For further documentation, refer to the GNU GCC documentation at gcc.gnu.org.

PGI Compiler Set

One or more versions of the PGI compiler set (compilers and associated libraries) are available on Radon. To discover which ones:

$ module avail pgi

Choose an appropriate PGI module and load it. For example:

$ module load pgi

Here are some examples for the PGI compilers:

Fortran 77
  Serial:  $ pgf77 myprogram.f -o myprogram
  MPI:     $ mpif77 myprogram.f -o myprogram
  OpenMP:  $ pgf77 -mp myprogram.f -o myprogram

Fortran 90
  Serial:  $ pgf90 myprogram.f90 -o myprogram
  MPI:     $ mpif90 myprogram.f90 -o myprogram
  OpenMP:  $ pgf90 -mp myprogram.f90 -o myprogram

Fortran 95
  Serial:  $ pgf95 myprogram.f95 -o myprogram
  MPI:     $ mpif90 myprogram.f95 -o myprogram
  OpenMP:  $ pgf95 -mp myprogram.f95 -o myprogram

C
  Serial:  $ pgcc myprogram.c -o myprogram
  MPI:     $ mpicc myprogram.c -o myprogram
  OpenMP:  $ pgcc -mp myprogram.c -o myprogram

C++
  Serial:  $ pgCC myprogram.cpp -o myprogram
  MPI:     $ mpiCC myprogram.cpp -o myprogram
  OpenMP:  $ pgCC -mp myprogram.cpp -o myprogram

More information on compiler options can be found in the official man pages, which are accessible with the man command after loading the appropriate compiler module. For further documentation, refer to the PGI compiler documentation.

Running Jobs on Radon

There are two methods for submitting jobs to the Radon community cluster. First, you may use the portable batch system (PBS) to submit jobs directly to a queue on Radon. PBS performs job scheduling. Jobs may be serial, message-passing, shared-memory, or hybrid (message-passing + shared-memory) programs. You may use either the batch or interactive mode to run your jobs. Use the batch mode for finished programs; use the interactive mode only for debugging. Second, since the Radon cluster is part of BoilerGrid, you may submit serial jobs to BoilerGrid and specifically request compute nodes on Radon.

Running Jobs via PBS

The Portable Batch System (PBS) is a richly featured workload management system providing job scheduling and job management interface on computing resources, including Linux clusters. With PBS, a user requests resources and submits a job to a queue. The system will then take jobs from queues, allocate the necessary nodes, and execute them in as efficient a manner as it can.

Do NOT run large, long, multi-threaded, parallel, or CPU-intensive jobs on a front-end login host. All users share the front-end hosts, and running anything but the smallest test job will negatively impact everyone's ability to use Radon. Always use PBS to submit your work as a job. You may even submit interactive sessions as jobs. This section of documentation will explain how to use PBS.
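
For example, to start an interactive session as a PBS job instead of working on a front end, you may use the standard -I option to qsub (the resource request shown is only an illustration):

$ qsub -I -l nodes=1:ppn=1,walltime=00:30:00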

Tips

  • Remember that ppn cannot be larger than the number of processor cores on each node.
  • If you compiled your own code, you must module load that same compiler from your job submission file. However, it is not necessary to load the standard compiler module if you load the corresponding compiler module with parallel libraries included.
  • To see a list of the nodes which ran your job: cat $PBS_NODEFILE
  • The order of processor cores is random. There is no way to tell which processor will do what or in which order in a parallel program.
  • If you use the tcsh and csh shells and if a .logout file exists in your home directory, the exit status of your jobs will be that of the .logout script, not the job submission file. This may impact any interjob dependencies. To preserve the job exit status, remove the .logout file.

Queues

Radon has only one queue, the "workq" queue, and it is open to all users of the system.

Job Submission File

To submit work to a PBS queue, you must first create a job submission file. This job submission file is essentially a simple shell script. It will set any required environment variables, load any necessary modules, create or modify files and directories in your scratch space, and invoke any applications that you need. However, a job submission file can be as simple as the path to your application:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# Print the hostname of the compute node on which this job is running.
/bin/hostname

Or, as simple as listing the names of compute nodes assigned to your job:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# PBS_NODEFILE contains the names of assigned compute nodes.
cat $PBS_NODEFILE

PBS sets several potentially useful environment variables which you may use within your job submission files. Here is a list of some:

Name Description
PBS_O_WORKDIR Absolute path of the current working directory when you submitted this job
PBS_JOBID Job ID number assigned to this job by the batch system
PBS_JOBNAME Job name supplied by the user
PBS_NODEFILE File containing the list of nodes assigned to this job
PBS_O_HOST Hostname of the system where you submitted this job
PBS_O_QUEUE Name of the original queue to which you submitted this job
PBS_O_SYSTEM Operating system name given by uname -s where you submitted this job
PBS_ENVIRONMENT "PBS_BATCH" if this job is a batch job, or "PBS_INTERACTIVE" if this job is an interactive job

Here is an example of a commonly used PBS variable, making sure a job runs from within the same directory that you submitted it from:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# Change to the directory from which you originally submitted this job.
cd $PBS_O_WORKDIR

# Print out the current working directory path.
pwd

You may also find the need to load a module to run a job on a compute node. Loading a module on a front end does NOT automatically load that module on the compute node where a job runs. You must use the job submission file to load a module on the compute node:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

# Load the module for NetPBM.
module load netpbm

# Convert a PostScript file to GIF format using NetPBM tools.
pstopnm myfilename.ps | ppmtogif > myfilename.gif

Job Submission

Once you have a job submission file, you may submit this script to PBS using the qsub command. PBS will find an available processor core or a set of processor cores and run your job there, or leave your job in a queue until some become available. At submission time, you may also optionally specify many other attributes or job requirements you have regarding where your jobs will run.

To submit your serial job to one processor core on one compute node with no special requirements:

$ qsub myjobsubmissionfile

To submit your job to a specific queue:

$ qsub -q myqueuename myjobsubmissionfile

By default, each job receives 30 minutes of wall time for its execution. The wall time is the total time in real clock time (not CPU cycles) that you believe your job will need to run to completion. If you know that your job will not need more than a certain amount of time to run, it is very much to your advantage to request less than the maximum allowable wall time, as this may allow your job to schedule and run sooner. To request the specific wall time of 1 hour and 30 minutes:

$ qsub -l walltime=01:30:00 myjobsubmissionfile

To submit your job with your currently-set environment variables:

$ qsub -V myjobsubmissionfile

The nodes resource indicates how many compute nodes you would like reserved for your job. The node property ppn specifies how many processor cores you need on each compute node. Each compute node in Radon has 8 processor cores. Detailed explanations regarding the distribution of your job across different compute nodes for parallel programs appear in the sections covering specific parallel programming libraries.

To request 2 compute nodes with 4 processor cores per node:

$ qsub -l nodes=2:ppn=4 myjobsubmissionfile

Here is a typical list of compute node names (the contents of $PBS_NODEFILE) from a qsub command requesting 2 compute nodes with 4 processor cores per node:

radon-a639
radon-a639
radon-a639
radon-a639
radon-a638
radon-a638
radon-a638
radon-a638

Note that if you request more than ppn=8 on Radon, your job will never run, because Radon compute nodes only have 8 processor cores each.

Normally, compute nodes running your job may also be running jobs from other users. ITaP research systems have many processor cores in each compute node, so node sharing allows more efficient use of the system. However, if you have special needs that prohibit others from effectively sharing a compute node with your job, such as needing all of the memory on a compute node, you may request exclusive access to any compute nodes allocated to your job.

To request exclusive access to a compute node, set ppn to the maximum number of processor cores physically available on a compute node:

$ qsub -l nodes=1:ppn=8 myjobsubmissionfile

If more convenient, you may also specify any command line options to qsub from within your job submission file, using a special form of comment:

#!/bin/sh -l
# FILENAME:  myjobsubmissionfile

#PBS -V
#PBS -q myqueuename
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:30:00
#PBS -N myjobname

# Print the hostname of the compute node on which this job is running.
/bin/hostname

If an option is present in both your job submission file and on the command line, the option on the command line will take precedence.

After you submit your job with qsub, it can reside in a queue for minutes, hours, or even weeks. How long it takes for a job to start depends on the specific queue, the number of compute nodes requested, the amount of wall time requested, and what other jobs already waiting in that queue requested as well. It is impossible to say for sure when any given job will start. For best results, request no more resources than your job requires.

PBS captures only output written to standard output and standard error. Standard output (output normally sent to the screen) will appear in your directory in a file whose extension begins with the letter "o", for example myjobsubmissionfile.o1234, where "1234" represents the PBS job ID. Errors that occurred during the job run and were written to standard error (also normally sent to the screen) will appear in your directory in a file whose extension begins with the letter "e", for example myjobsubmissionfile.e1234. Often, the error file is empty. If your job wrote results to a file, those results will appear in that file.

Parallel applications may require special care in the selection of PBS resources. Please refer to the sections that follow for details on how to run parallel applications with various parallel libraries.

Job Status

The command qstat -a will list all jobs currently queued or running and some information about each:

$ qstat -a

radon-adm.rcac.purdue.edu:
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
107025.radon    user123  workq    hello         --    1   8    --  00:05 Q   --
115505.radon    user456  ncn      job4         5601   1   1    --  600:0 R 575:0
...
189479.radon    user456  workq    AR4b          --    5  40    --  04:00 H   --
189481.radon    user789  workq    STDIN        1415   1   1    --  00:30 R 00:07
189483.radon    user789  workq    STDIN        1758   1   1    --  00:30 R 00:07
189484.radon    user456  workq    AR4b          --    5  40    --  04:00 H   --
189485.radon    user456  workq    AR4b          --    5  40    --  04:00 Q   --
189486.radon    user123  tg_workq STDIN         --    1   1    --  12:00 Q   --
189490.radon    user456  workq    job7        26655   1   8    --  04:00 R 00:06
189491.radon    user123  workq    job11         --    1   8    --  04:00 Q   --

The status of each job listed appears in the "S" column toward the right. Possible status codes are: "Q" = Queued, "R" = Running, "C" = Completed, and "H" = Held.

To see only your own jobs, use the -u option to qstat and specify your own username:

$ qstat -a -u myusername

radon-adm.rcac.purdue.edu:
                                                              Req'd  Req'd   Elap
Job ID          Username   Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- ---------- -------- ---------- ------ --- --- ------ ----- - -----
182792.radon    myusername workq    job1        28422   1   4    --  23:00 R 20:19
185841.radon    myusername workq    job2        24445   1   4    --  23:00 R 20:19
185844.radon    myusername workq    job3        12999   1   4    --  23:00 R 20:18
185847.radon    myusername workq    job4        13151   1   4    --  23:00 R 20:18

To retrieve useful information about your queued or running job, use the checkjob command with your job's ID number. The output should look similar to the following:

$ checkjob -v 163000

job 163000 (RM job '163000.radon-adm.rcac.purdue.edu')

AName: test
State: Idle 
Creds:  user:myusername  group:mygroup  class:myqueue
WallTime:   00:00:00 of 20:00:00
SubmitTime: Wed Apr 18 09:08:37
  (Time Queued  Total: 1:24:36  Eligible: 00:00:23)

NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 2
Total Requested Nodes: 1

Req[0]  TaskCount: 2  Partition: ALL  
TasksPerNode: 2  NodeCount:  1


Notification Events: JobFail

IWD:            /home/myusername/gaussian
UMask:          0000 
OutputFile:     radon-fe00.rcac.purdue.edu:/home/myusername/gaussian/test.o163000
ErrorFile:      radon-fe00.rcac.purdue.edu:/home/myusername/gaussian/test.e163000
User Specified Partition List:   radon-adm,SHARED
Partition List: radon-adm
SrcRM:          radon-adm  DstRM: radon-adm  DstRMJID: 163000.radon-adm.rcac.purdue.edu
Submit Args:    -l nodes=1:ppn=2,walltime=20:00:00 -q myqueue
Flags:          RESTARTABLE
Attr:           checkpoint
StartPriority:  1000
PE:             2.00
NOTE:  job violates constraints for partition radon-adm (job 163000 violates active HARD MAXPROC limit of 160 for class myqueue  partition ALL (Req: 2  InUse: 160))

BLOCK MSG: job 163000 violates active HARD MAXPROC limit of 160 for class myqueue  partition ALL (Req: 2  InUse: 160) (recorded at last scheduling iteration)

There are several useful bits of information in this output.

  • State lets you know if the job is Idle, Running, Completed, or Held.
  • WallTime will show how long the job has run and its maximum time.
  • SubmitTime is when the job was submitted to the cluster.
  • Total Requested Tasks is the total number of cores used for the job.
  • Total Requested Nodes and NodeCount are the number of nodes used for the job.
  • TasksPerNode is the number of cores used per node.
  • IWD is the job's working directory.
  • OutputFile and ErrorFile are the locations of stdout and stderr of the job, respectively.
  • Submit Args will show the arguments given to the qsub command.
  • NOTE/BLOCK MSG will show details on why the job isn't running. The above error says that all the cores are in use on that queue and the job has to wait. Other errors may give insight as to why the job fails to start or is held.

To view the output of a running job, use the qpeek command with your job's ID number. The -f option will continually output to the screen similar to tail -f, while qpeek without options will just output the whole file so far. Here is an example output from an application:

$ qpeek -f 1651025
TIMING: 600  CPU: 97.0045, 0.0926592/step  Wall: 97.0045, 0.0926592/step, 0.11325 hours remaining, 809.902344 MB of memory in use.
ENERGY:     600    359272.8746    280667.4810     81932.7038      5055.7519       -4509043.9946    383233.0971         0.0000         0.0000    947701.9550       -2451180.1312       298.0766  -3398882.0862  -2442581.9707       298.2890           1125.0475        77.0325  10193721.6822         3.5650         3.0569

TIMING: 800  CPU: 118.002, 0.104987/step  Wall: 118.002, 0.104987/step, 0.122485 hours remaining, 809.902344 MB of memory in use.
ENERGY:     800    360504.1138    280804.0922     82052.0878      5017.1543       -4511471.5475    383214.3057         0.0000         0.0000    946597.3980       -2453282.3958       297.7292  -3399879.7938  -2444652.9520       298.0805            978.4130        67.0123  10193578.8030        -0.1088         0.2596

TIMING: 1000  CPU: 144.765, 0.133817/step  Wall: 144.765, 0.133817/step, 0.148686 hours remaining, 809.902344 MB of memory in use.
ENERGY:    1000    361525.2450    280225.2207     81922.0613      5126.4104       -4513315.2802    383460.2355         0.0000         0.0000    947232.8722       -2453823.2352       297.9291  -3401056.1074  -2445219.8163       297.9184            823.8756        43.2552  10193174.7961        -0.7191        -0.2392
...

Job Cancellation

To stop a job before it finishes or remove it from a queue, use the qdel command:

$ qdel myjobid

You can find the job ID using the qstat command, as explained in the Job Status section above.

Examples

To submit jobs successfully, you must understand how to request the right computing resources. This section contains examples of specific types of PBS jobs. These examples illustrate requesting various groupings of nodes and processor cores, using various parallel libraries, and running interactive jobs. You may wish to look here for an example that is most similar to your application and use a modified version of that example's job submission file for your jobs.

Serial

A serial job is a single process whose steps execute as a sequential stream of instructions on one processor core.

This section illustrates how to use PBS to submit to a batch session one of the serial programs compiled in the section Compiling Serial Programs. There is no difference in running a Fortran, C, or C++ serial program after compiling and linking it into an executable file.

Suppose that you named your executable file serial_hello. Prepare a job submission file with an appropriate filename, here named serial_hello.sub:

#!/bin/sh -l
# FILENAME:  serial_hello.sub

module load devel
cd $PBS_O_WORKDIR

./serial_hello

Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the executable program.

Submit the serial job to the default queue on Radon and request 1 compute node with 1 processor core and 1 minute of wall time. Requesting the default queue does not require explicitly asking for it. Job completion can take a while depending on the demand placed on the compute cluster:

$ qsub -l nodes=1:ppn=1,walltime=00:01:00 ./serial_hello.sub

View two new files in your directory (.o and .e):

$ ls -l
serial_hello
serial_hello.c
serial_hello.sub
serial_hello.sub.emyjobid
serial_hello.sub.omyjobid

View results in the output file:

$ cat serial_hello.sub.omyjobid
Runhost:radon-a639.rcac.purdue.edu   hello, world

If the job failed to run, then view error messages in the file serial_hello.sub.emyjobid.

If a serial job uses a lot of memory and the compute node's memory would be overcommitted when sharing the node with other jobs, request all of the processor cores physically available on the compute node to gain exclusive use of it:

$ qsub -l nodes=1:ppn=8,walltime=00:01:00 serial_hello.sub

View results in the output file:

$ cat serial_hello.sub.omyjobid
Runhost:radon-a639.rcac.purdue.edu   hello, world

OpenMP

A shared-memory job is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. It distributes the work of a process over several processor cores of a multi-core processor. Open Multi-Processing (OpenMP) is a specific implementation of the shared-memory model and is a collection of parallelization directives, library routines, and environment variables.

This section illustrates how to use PBS to submit to a batch session one of the OpenMP programs, either task parallelism or loop-level (data) parallelism, compiled in the section Compiling OpenMP Programs. There is no difference in running a Fortran, C, or C++ OpenMP program after compiling and linking it into an executable file.

The OpenMP runtime library automatically creates the optimal number of threads for execution in parallel on the multiple processor cores of a compute node. If you are running the program on a system with only one processor, you will not see any speedup. In fact, the program may run more slowly due to the overhead in the synchronization code generated by the compiler. For best performance, the number of threads should typically be equal to the number of processor cores you will be using.

When running OpenMP programs, all threads should be on the same compute node to take advantage of shared memory.

To run an OpenMP program, set the environment variable OMP_NUM_THREADS to the desired number of threads:

In csh:

$ setenv OMP_NUM_THREADS mynumberofthreads

In bash:

$ export OMP_NUM_THREADS=mynumberofthreads

You should also set the environment variable PARALLEL to 1. This variable must be set or else any timers used by the program will return incorrect timings (see the etime man page for more details).
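
For example, in bash or ksh:

$ export PARALLEL=1

In tcsh or csh:

$ setenv PARALLEL 1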

Suppose that you named your executable file omp_hello. Prepare a job submission file with an appropriate name, here named omp_hello.sub:

#!/bin/sh -l
# FILENAME:  omp_hello.sub

module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=8

./omp_hello

Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the program.

Submit the OpenMP job to the default queue on Radon and request 1 complete compute node with all 8 processor cores (OpenMP threads) on the compute node and 1 minute of wall time. This will use one complete compute node of the Radon cluster. Requesting the default queue does not require explicitly asking for it. Job completion can take a while depending on the demand placed on the compute cluster.

$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_hello.sub

View two new files in your directory (.o and .e):

$ ls -l
omp_hello
omp_hello.c
omp_hello.sub
omp_hello.sub.emyjobid
omp_hello.sub.omyjobid

View the results from one of the sample OpenMP programs about task parallelism:

$ cat omp_hello.sub.omyjobid
SERIAL REGION:     Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 8 threads   hello, world
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:1 of 8 threads   hello, world
   ...
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:7 of 8 threads   hello, world
SERIAL REGION:     Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world

If the job failed to run, then view error messages in the file omp_hello.sub.emyjobid.

If an OpenMP program uses a lot of memory and 8 threads overcommit the memory of the compute node, specify fewer processor cores (OpenMP threads) on that compute node.

Modify the job submission file omp_hello.sub to use half the number of processor cores:

#!/bin/sh -l
# FILENAME:  omp_hello.sub

module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4

./omp_hello

Submit the job to the default queue with half the number of processor cores:

$ qsub -l nodes=1:ppn=4,walltime=00:01:00 omp_hello.sub

View the results from one of the sample OpenMP programs about task parallelism and using half the number of processor cores:

$ cat omp_hello.sub.omyjobid

SERIAL REGION:     Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 4 threads   hello, world
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:1 of 4 threads   hello, world
   ...
PARALLEL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:3 of 4 threads   hello, world
SERIAL REGION:     Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world

To retain exclusive use of a compute node while using fewer OpenMP threads than the number of processor cores physically available on that compute node:

#!/bin/sh -l
# FILENAME:  omp_hello.sub

module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4

./omp_hello

$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_hello.sub

SERIAL REGION:     Runhost:radon-a639.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
PARALLEL REGION:   Runhost:radon-a639.rcac.purdue.edu   Thread:0 of 4 threads   hello, world
PARALLEL REGION:   Runhost:radon-a639.rcac.purdue.edu   Thread:1 of 4 threads   hello, world
PARALLEL REGION:   Runhost:radon-a639.rcac.purdue.edu   Thread:2 of 4 threads   hello, world
PARALLEL REGION:   Runhost:radon-a639.rcac.purdue.edu   Thread:3 of 4 threads   hello, world
SERIAL REGION:     Runhost:radon-a639.rcac.purdue.edu   Thread:0 of 1 thread    hello, world

Practice submitting the sample OpenMP program about loop-level (data) parallelism:

#!/bin/sh -l
# FILENAME:  omp_loop.sub

module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=8

./omp_loop

$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_loop.sub

SERIAL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 8 threads   Iteration:0  hello, world
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 8 threads   Iteration:1  hello, world
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:1 of 8 threads   Iteration:2  hello, world
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:1 of 8 threads   Iteration:3  hello, world
   ...
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:7 of 8 threads   Iteration:14  hello, world
PARALLEL LOOP:   Runhost:radon-c044.rcac.purdue.edu   Thread:7 of 8 threads   Iteration:15  hello, world
SERIAL REGION:   Runhost:radon-c044.rcac.purdue.edu   Thread:0 of 1 thread    hello, world
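
For reference, a loop-level (data) parallel OpenMP program of this kind typically looks like the following minimal C sketch. This is a hypothetical illustration, not necessarily the actual omp_loop source, which may differ (for example, in how it reports the run host):

/* FILENAME:  omp_loop_sketch.c  (hypothetical illustration) */
#include <omp.h>
#include <stdio.h>

int main(void)
{
    int i;

    printf("SERIAL REGION:   Thread:%d of %d thread    hello, world\n",
           omp_get_thread_num(), omp_get_num_threads());

    /* OpenMP divides the 16 loop iterations among the available threads. */
    #pragma omp parallel for
    for (i = 0; i < 16; i++) {
        printf("PARALLEL LOOP:   Thread:%d of %d threads   Iteration:%d  hello, world\n",
               omp_get_thread_num(), omp_get_num_threads(), i);
    }

    printf("SERIAL REGION:   Thread:%d of %d thread    hello, world\n",
           omp_get_thread_num(), omp_get_num_threads());

    return 0;
}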

MPI

A message-passing job is a set of processes (often multiple copies of a single program) that take advantage of distributed-memory systems by communicating with each other through the sending and receiving of messages. Work occurs across several compute nodes of a distributed-memory system. The Message-Passing Interface (MPI) is a standard that specifies this message-passing model as a collection of library functions. OpenMPI, MPICH2, and Intel MPI (IMPI) are implementations of the MPI standard.
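
The mpi_hello examples below simply print a greeting from every rank. As a brief illustration of explicit message passing itself, the following minimal C sketch (a hypothetical example, not one of the sample programs from the Compiling MPI Programs section) sends a single integer from rank 1 to rank 0:

/* FILENAME:  mpi_message_sketch.c  (hypothetical illustration) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 1) {
        int value = 42;
        /* Rank 1 sends one integer, tagged 0, to rank 0. */
        MPI_Send(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        int value;
        /* Rank 0 receives the integer sent by rank 1. */
        MPI_Recv(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Rank 0 received %d from rank 1\n", value);
    }

    MPI_Finalize();
    return 0;
}

It compiles and runs the same way as the sample programs (for example, with mpicc and mpiexec -n 2) after loading an MPI module.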

This section illustrates how to use PBS to submit, as a batch job, one of the MPI programs compiled in the section Compiling MPI Programs. There is no difference in running a Fortran, C, or C++ MPI program after compiling and linking it into an executable file.

The path to the relevant MPI libraries is not set up on any run host by default. Using module load is the preferred way to access these libraries. Use module avail to see all software packages installed on Radon, including MPI library packages, and then load one of the available MPI modules with the module load command.

Suppose that you named your executable file mpi_hello. Prepare a job submission file with an appropriate filename, here named mpi_hello.sub:

#!/bin/sh -l
# FILENAME:  mpi_hello.sub

module load devel
cd $PBS_O_WORKDIR

mpiexec -n 16 ./mpi_hello

You can load any MPI library/compiler module that is available on Radon (this example uses the recommended OpenMPI library).

Since PBS sets the working directory of a job to your home directory, either run the cd $PBS_O_WORKDIR command, which changes the run-time working directory to the directory from which you submitted the job via qsub, or give the full path to the directory containing the executable program, as in the example below.
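
For example, the mpiexec line in the submission file could instead name the executable by its full path (mydirectory is a placeholder for your own directory, following the naming convention of this guide):

mpiexec -n 16 $HOME/mydirectory/mpi_hello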

You invoke an MPI program with the mpiexec command. The number of processes requested with mpiexec -n becomes the number of MPI ranks of the application and should typically equal the total number of processor cores you request from PBS (more on this below).

Submit the MPI job to the default queue on Radon, requesting 2 compute nodes with all 8 processor cores (8 MPI ranks) on each compute node and 1 minute of wall time. This uses two complete compute nodes of the Radon cluster. You do not need to name the default queue explicitly. Job completion can take a while, depending on the demand placed on the compute cluster.

$ qsub -l nodes=2:ppn=8,walltime=00:01:00 ./mpi_hello.sub

View two new files in your directory (.o and .e):

$ ls -l
mpi_hello
mpi_hello.c
mpi_hello.sub
mpi_hello.sub.emyjobid
mpi_hello.sub.omyjobid

View results in the output file:

$ cat mpi_hello.sub.omyjobid
Runhost:radon-a010.rcac.purdue.edu   Rank:0 of 16 ranks   hello, world
Runhost:radon-a010.rcac.purdue.edu   Rank:1 of 16 ranks   hello, world
   ...
Runhost:radon-a010.rcac.purdue.edu   Rank:7 of 16 ranks   hello, world
Runhost:radon-a011.rcac.purdue.edu   Rank:8 of 16 ranks   hello, world
Runhost:radon-a011.rcac.purdue.edu   Rank:9 of 16 ranks   hello, world
   ...
Runhost:radon-a011.rcac.purdue.edu   Rank:15 of 16 ranks   hello, world

If the job failed to run, then view error messages in the file mpi_hello.sub.emyjobid.

If an MPI job uses a lot of memory and 8 MPI ranks per compute node overcommit the memory of the compute nodes, request more compute nodes and fewer processor cores (MPI ranks) on each compute node, while keeping the total number of MPI ranks unchanged.

Submit the job to the default queue with double the number of compute nodes and half the number of processor cores and MPI ranks per compute node (the total number of MPI ranks remains unchanged):

$ qsub -l nodes=4:ppn=4,walltime=00:01:00 ./mpi_hello.sub

View results in the output file:

$ cat mpi_hello.sub.omyjobid
Runhost:radon-c010.rcac.purdue.edu   Rank:0 of 16 ranks   hello, world
Runhost:radon-c010.rcac.purdue.edu   Rank:1 of 16 ranks   hello, world
   ...
Runhost:radon-c010.rcac.purdue.edu   Rank:3 of 16 ranks   hello, world
Runhost:radon-c011.rcac.purdue.edu   Rank:4 of 16 ranks   hello, world
Runhost:radon-c011.rcac.purdue.edu   Rank:5 of 16 ranks   hello, world
   ...
Runhost:radon-c011.rcac.purdue.edu   Rank:7 of 16 ranks   hello, world
Runhost:radon-c012.rcac.purdue.edu   Rank:8 of 16 ranks   hello, world
Runhost:radon-c012.rcac.purdue.edu   Rank:9 of 16 ranks   hello, world
   ...
Runhost:radon-c012.rcac.purdue.edu   Rank:11 of 16 ranks   hello, world
Runhost:radon-c013.rcac.purdue.edu   Rank:12 of 16 ranks   hello, world
Runhost:radon-c013.rcac.purdue.edu   Rank:13 of 16 ranks   hello, world
   ...
Runhost:radon-c013.rcac.purdue.edu   Rank:15 of 16 ranks   hello, world

With only 4 of the 8 cores requested per node, the job shares its compute nodes with other jobs, and this sharing may still overcommit the memory.

To scatter 4 MPI ranks to 4 different compute nodes with each MPI rank having exclusive use of its compute node, apply the Linux command uniq to make a list of unique compute node names:

#!/bin/sh -l
# FILENAME:  mpi_hello.sub

module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile

mpiexec -n 4 -machinefile nodefile ./mpi_hello

$ qsub -l nodes=4:ppn=8,walltime=00:01:00 ./mpi_hello.sub

Runhost: radon-a637.rcac.purdue.edu   Rank: 0 of 4 ranks   hello, world
Runhost: radon-a636.rcac.purdue.edu   Rank: 1 of 4 ranks   hello, world
Runhost: radon-a634.rcac.purdue.edu   Rank: 2 of 4 ranks   hello, world
Runhost: radon-a633.rcac.purdue.edu   Rank: 3 of 4 ranks   hello, world

To distribute 8 MPI ranks across 4 different compute nodes, with each pair of MPI ranks having exclusive use of its compute node, pass the output of uniq through a small helper script that emits each compute node name twice:

#!/bin/sh -l
# FILENAME:  rankspernode

# For each unique compute node name, output two copies.
while read LINE; do
    echo "$LINE"
    echo "$LINE"
done
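
The job submission file below invokes this helper as ./rankspernode, so it must have execute permission (a step the example assumes you have already done):

$ chmod +x rankspernode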

#!/bin/sh -l
# FILENAME:  mpi_hello.sub

module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE | ./rankspernode >nodefile

mpiexec -n 8 -machinefile nodefile ./mpi_hello

$ qsub -l nodes=4:ppn=8,walltime=00:01:00 ./mpi_hello.sub

Runhost: radon-a135.rcac.purdue.edu   Rank: 0 of 8 ranks   hello, world
Runhost: radon-a135.rcac.purdue.edu   Rank: 1 of 8 ranks   hello, world
Runhost: radon-a136.rcac.purdue.edu   Rank: 2 of 8 ranks   hello, world
Runhost: radon-a136.rcac.purdue.edu   Rank: 3 of 8 ranks   hello, world
Runhost: radon-a137.rcac.purdue.edu   Rank: 4 of 8 ranks   hello, world
Runhost: radon-a137.rcac.purdue.edu   Rank: 5 of 8 ranks   hello, world
Runhost: radon-a138.rcac.purdue.edu   Rank: 6 of 8 ranks   hello, world
Runhost: radon-a138.rcac.purdue.edu   Rank: 7 of 8 ranks   hello, world

Notes

  • In general, the order in which MPI ranks write similar output to a shared output file is not deterministic.
  • When you use mpiexec, PBS will cleanly kill tasks that exceed their assigned limits of CPU time, wall clock time, memory usage, or disk space.
  • Use qlist to determine which queues are available to you. The name of the queue which is available to everyone on Radon is "workq".
  • Invoking an MPI program on Radon with ./program is typically wrong, since this will use only one MPI process and defeat the purpose of using MPI. Unless that is what you want (rarely the case), you should use mpiexec to invoke an MPI program.

For an introductory tutorial on how to write your own MPI programs:

Running Jobs via HTCondor

HTCondor allows you to run jobs on systems that would otherwise sit idle, for as long as their primary users do not need them. HTCondor is one of several distributed computing systems which ITaP makes available. Most ITaP research resources, in addition to being available through normal means, are part of BoilerGrid and are accessible via HTCondor. If a primary user needs a processor core on a compute node, HTCondor immediately checkpoints and/or migrates any HTCondor jobs on that compute node and returns the resource to the primary user. Shorter jobs therefore have a better completion rate via HTCondor than longer jobs; even though HTCondor may have to restart jobs elsewhere, BoilerGrid offers a vast amount of computational resources. Nearly all ITaP research systems are part of BoilerGrid, as are large numbers of lab machines at the West Lafayette and other Purdue campuses, making BoilerGrid one of the largest HTCondor pools in the world. Some machines at other institutions, part of a larger HTCondor federation known as DiaGrid, are available as well.

For more information:

Radon Frequently Asked Questions (FAQ)

There are currently no FAQs for Radon.