Black - Getting Started

Overview of Black

The Black cluster is Purdue's portion of the Indiana Economic Development Corporation (IEDC) machine at Indiana University, the IU portion of which is known as "Big Red". Black consists of 256 IBM JS21 Blades, each a Dual-Processor 2.5 GHz Dual-Core PowerPC 970MP with 8 GB of RAM and PCI-X Myrinet 2000 interconnects. The shared memory on each node provides very fast communication between the processors within that node, making this system well suited to large parallel jobs.

Detailed Hardware Specification

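Number of Nodes  Processor                                       Cores per Node  Memory per Node  Interconnect        TeraFlops
---------------  ----------------------------------------------  --------------  ---------------  ------------------  ---------
256              Dual-Processor 2.5 GHz Dual-Core PowerPC 970MP  4               8 GB             PCI-X Myrinet 2000  5.12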

Aside from Myrinet, Black nodes are also connected by Gigabit Ethernet to a 266 TB GPFS filesystem, hosted on 16 IBM p505 Power5 systems.

All Black nodes run SuSE Linux Enterprise Server 9 and use LoadLeveler 3.4.0 and Moab for resource and job management. Operating system patches are applied monthly or as security needs dictate. All nodes have been configured to allow for unlimited stack usage, as well as unlimited core dump size (though disk space and server quotas may still be a limiting factor).

Obtaining an Account

Purdue faculty and staff may obtain accounts on Black and Gray. If you are a Purdue affiliate, please use the online Research Computing Account Request Form.

You must enter your name and date of birth in the “Comments” area, which is on the last screen of the request.

In a few days, you should receive email indicating your account is ready and including an Indiana University ID number. You must then create a password/passphrase for your Black account using the online form at https://itaccounts.iu.edu/. Reserve 15 minutes to complete this process:

  1. Click on "Create my first IU computing accounts".
  2. Click "continue".
  3. Read the agreements, and click "continue" five times.
  4. If you agree to the terms of the Guidelines for Appropriate Usage, then type "Yes" and click "continue". If you do not agree, click "Exit".
  5. You will now be prompted to enter your last name, birth date, and Indiana University ID number received in your account notification email.
  6. You must now choose a passphrase. It must consist of 15-127 characters and four or more words (two or more distinct letters separated by one or more non-letters). The passphrase may not contain "@" or "#", and it should not be a common phrase (such as "to be or not to be" or "april showers bring may flowers"). The passphrase should not be based on predictable patterns such as the alphabet or the layout of a standard keyboard, and it must not contain your real name or username.
  7. After you have chosen a passphrase, you may opt to enroll in the self-service reset system for the passphrase. If you do not do this, then you will need to contact the help desk if you ever forget your passphrase. If you choose to enroll in the self-service reset system, you will be asked to set up 3-10 questions and answers which you will be asked to verify if you ever forget your passphrase.
  8. Your account should now be configured. Allow up to 24 hours before you can access your account.
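
The mechanical passphrase rules in step 6 can be checked before you visit the form. The following is a minimal sketch, assuming bash; the function name check_passphrase and the sample phrases are made up for illustration, and the script cannot judge whether a phrase is common or predictable.

```shell
#!/bin/bash
# Sketch: check a candidate passphrase against the mechanical rules only
# (length, forbidden characters, word count). Hypothetical helper.
check_passphrase() {
    local phrase="$1"
    local length=${#phrase}

    # Rule: 15-127 characters.
    if [ "$length" -lt 15 ] || [ "$length" -gt 127 ]; then
        echo "length must be 15-127 characters (got $length)"
        return 1
    fi

    # Rule: no "@" or "#".
    case "$phrase" in
        *[@#]*) echo 'must not contain "@" or "#"'; return 1 ;;
    esac

    # Rule: four or more words, where a word is a run of letters that
    # contains at least two distinct letters (compared case-insensitively).
    local words
    words=$(printf '%s\n' "$phrase" |
        tr -c 'A-Za-z\n' '\n' |
        awk '{
                 w = tolower($0)
                 first = substr(w, 1, 1)
                 for (i = 2; i <= length(w); i++)
                     if (substr(w, i, 1) != first) { n++; break }
             }
             END { print n + 0 }')
    if [ "$words" -lt 4 ]; then
        echo "needs four or more words (found $words)"
        return 1
    fi

    echo "ok"
}

check_passphrase 'violet kettle orbits seven lamps'
```

A phrase that passes this check may still be rejected by the online form, which applies the remaining rules itself.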

Login / SSH

To issue jobs on Black, log in to the front-end host black.rcac.purdue.edu via SSH. Note that you must use the password you created for Black when you obtained your account, not your Purdue career account password.

Black also supports GSI-SSH access using TeraGrid credentials.

Here is what an initial login to Black will look like. Note you will be asked to choose a shell and then to give your passphrase/password once again.

$ ssh myusername@black.rcac.purdue.edu
Warning: Permanently added the RSA host key for IP address '149.165.234.32' to the list of known hosts.
Password: 

  *********************************************************************
            Welcome to Indiana University's Big Red Cluster 
           Send questions, comments, etc. to hps-admin@iu.edu
  *********************************************************************

    BigRed message of the day here...

  *********************************************************************




Welcome to Big Red!

This program is run the very first time you log in
to Big Red to allow you to select your login shell.
If you are uncertain which shell to select, choose
bash (Bourne-again shell).

1) bash
2) tcsh
3) ksh
4) zsh
5) quit
Select 1-5: 1
Changing login shell for myusername.
Password: 
Shell changed.
Your shell has been changed to the Bourne-again shell
This will take effect on all nodes within 60 minutes
generating ssh file /N/u/myusername/BigRed/.ssh/id_rsa ...
Generating public/private rsa key pair.
Created directory '/N/u/myusername/BigRed/.ssh'.
Your identification has been saved in /N/u/myusername/BigRed/.ssh/id_rsa.
Your public key has been saved in /N/u/myusername/BigRed/.ssh/id_rsa.pub.
The key fingerprint is:
....
adding id to ssh file /N/u/myusername/BigRed/.ssh/authorized_keys
myusername@BigRed:/N/hd01/myusername/BigRed>

SSH Client Software

All access to the RCAC systems must be through secure (encrypted) connections. Standard telnet and FTP are not supported. SSH, SCP, and SFTP may be used instead.

Secure Shell or SSH is a way of establishing a secure channel between a local and a remote computer. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. It is usually used to log in to a remote machine and execute commands, much like telnet, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. The associated SFTP and SCP protocols may be used to transfer files. There are many SSH clients available, depending on the operating system you use.

Linux / Solaris / AIX / HP-UX / Unix:

  • "ssh", "sftp", and "scp" are pre-installed. Log in using ssh myusername@servername.

Microsoft Windows:

Mac OS X:

  • "ssh", "sftp", and "scp" are pre-installed. You may start a local terminal window from "Applications->Utilities". Log in using ssh myusername@servername.
  • MacSSH and MacSFTP
  • NiftyTelnet 1.1 SSH

SSH Keys

SSH can be used in conjunction with many different means of authentication. One popular authentication method is called Public Key Authentication (PKA). PKA is a method of establishing your identity to a remote computer using related sets of encryption data called keys. PKA is a more secure alternative to traditional password-based authentication with which you are probably familiar.

To employ PKA via SSH, you manually generate a keypair (also called SSH keys) in the location from which you wish to initiate a connection to a remote machine. This keypair consists of two text files, one of which is called a private key and the other a public key. You keep the private key file confidential on your local machine or local home directory (hence the name "private" key). You then log in to a remote machine (if possible) and append the corresponding public key text to the end of a specific file, or have a system administrator do so on your behalf. In future login attempts, the remote machine uses the public key to verify that you hold the matching private key, which then grants you access to the remote machine.
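
The keypair workflow just described can be tried safely in a throwaway directory. This is a minimal sketch, assuming an OpenSSH-style ssh-keygen is installed; the directory names, key comment, and passphrase are placeholders, and the "remote_ssh" directory only stands in for the .ssh directory of a remote account.

```shell
#!/bin/bash
# Sketch: generate a passphrase-protected keypair and append the public key
# to an authorized_keys file. Everything happens in a scratch directory so
# your real ~/.ssh is not touched.
demo_dir=$(mktemp -d)
key_file="$demo_dir/id_rsa_demo"

if command -v ssh-keygen >/dev/null 2>&1; then
    # In real use, omit -N so that ssh-keygen prompts for the passphrase
    # instead of exposing it on the command line.
    ssh-keygen -q -t rsa -N 'a placeholder passphrase' \
        -C 'demo key' -f "$key_file"

    # Stand-in for the remote account's .ssh/authorized_keys file.
    mkdir -p "$demo_dir/remote_ssh"
    chmod 700 "$demo_dir/remote_ssh"
    cat "$key_file.pub" >> "$demo_dir/remote_ssh/authorized_keys"
    chmod 600 "$demo_dir/remote_ssh/authorized_keys"
    keygen_status="created"
else
    keygen_status="ssh-keygen not found"
fi
echo "$keygen_status"
```

Only the .pub file is ever copied to another machine; the private key file stays where it was generated.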

As a user, you can create, maintain, and employ as many keypairs as you wish. If you connect to a computational resource from your work laptop, your work desktop, and your home desktop, you can create and employ keypairs on each. You can also create multiple keypairs on a single local machine to serve different purposes, such as establishing access to different remote machines, or establishing different types of access to a single remote machine. In short, PKA via SSH offers a secure but flexible means of identifying yourself to all kinds of computational resources.

Passphrases and SSH Keys

When you create a keypair, you are prompted to provide a passphrase for the private key. This passphrase is different from a password in a number of ways. First, a passphrase is, as the name implies, a phrase. It can include most types of characters, including spaces, and has no limits on length. Second, this passphrase is not transmitted to the remote machine for verification. It is used only to unlock your local private key and is specific to that key.

Perhaps you are wondering why you would need a private key passphrase at all when using PKA. If the private key is kept secure, why the need for a passphrase just to use it? Indeed, if the location of your private keys were always completely secure, a passphrase might not be needed. In reality, a number of situations could arise in which someone may improperly gain access to your private key files. In these situations, a passphrase offers another level of security for you, the user who created the keypair.

Think of the private key/passphrase combination as being analogous to your ATM card/PIN combination. The ATM card itself is the object that grants access to your important accounts, and as such, should be kept secure at all times—just as a private key should. But if you ever lose your wallet or your ATM card is stolen, you are glad that your PIN exists to offer you another level of protection. The same is true for a private key passphrase.

When you create a keypair, you should always provide a corresponding private key passphrase. For security purposes, avoid using phrases that would be guessed by automated programs (e.g. phrases that consist solely of words in English-language dictionaries). This passphrase can never be recovered if forgotten, so make note of it. There are only limited situations when the use of a non-passphrase-protected private key is warranted—conducting automated file backups is one such situation. If you need to use a non-passphrase-protected private key to conduct automated backups to Fortress, see the No-Passphrase SSH Keys section.

Passwords

When you set up your account on Black, you created a passphrase. To change it, use https://passphrase.iu.edu/, since this will change it on most UITS systems, including webmail.

Note that it may take up to 20 minutes before the change is reflected on the systems. You should log out of all UITS systems before making the change, so as to avoid any potential problems. The passphrase for Black is totally independent of the one you may have on other Purdue systems, and changing one will not affect the other.

There is not currently any requirement regarding how often you must change your password on Black, but for security reasons it would be good to change it at least once every 6 months, preferably every 3 months.

All passwords should:

  • Be something you have never used as a password before, on this or any other system.
  • Be easy for you to remember and difficult for others to guess.
  • Be at least eight characters long.
  • Be a combination of upper and lowercase letters, numbers, and symbols.
  • TIP: Choose a sentence or song lyric and abbreviate it: "The dog Samson ate 4 new slippers!" = "TdSa4ns!"

Never share your password with another user or make your password known to anyone else. Systems staff will NEVER ask for your password, by email or otherwise.

Storage Options

File storage options on Black include home directories, scratch file systems, and /tmp. Each of these has different performance characteristics and intended uses. Home directories are backed up nightly, but scratch and /tmp are not and may occasionally be purged without warning. Below is more detail about each of these storage options.

Home Directories

Your home directory is the default directory you are placed in when you log in.

You should use this space for storing files you want to keep long term such as source code, scripts, input data sets, etc. It should also be used for files you want to keep and which you use often. Your home directory will physically reside on an NFS server connected via Gigabit Ethernet. You can find the path to your home directory by logging in, and typing "pwd":

$ pwd
/home/somepath/myusername

Note that your home directory on Black is not the same as your home directory on other RCAC systems. You may transfer data between Black and other RCAC systems using one of the programs mentioned in the File Transfer section.

Scratch Directories

Scratch directories are intended for short term file storage only.

Backups are not performed on the scratch directories, and files there may be removed (purged) without warning. In the event of a disk crash or file purge, files in scratch directories cannot be recovered. Please be sure to copy any important files to more permanent storage.

/tmp Directory

The /tmp directory is intended for temporary files that are used during the execution of a process or job or while you examine files created by your jobs. Used properly, /tmp may provide faster local storage to an active process than any other storage option. However, do not use it for longer-term storage or critical results.

Files stored in /tmp are not backed up and are removed automatically once they are more than 24 hours old, whenever space is low, or whenever the system is rebooted. In the event of a loss, files in /tmp cannot be recovered, so use it only for files that can be recreated relatively easily.
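
One way to follow this advice is to do scratch work in a private directory under /tmp and copy only the final result elsewhere. This is a minimal sketch, assuming bash and mktemp; the function name run_in_tmp, the file names, and the stand-in results directory are hypothetical.

```shell
#!/bin/bash
# Sketch: work in a private /tmp directory, keep only the final result.
# "results_dir" is a hypothetical destination; a temporary stand-in is used
# here so the example is self-contained.
results_dir=$(mktemp -d)

run_in_tmp() (
    # The parentheses run this function in a subshell, so the trap below
    # fires when the function finishes, not when your login shell exits.
    work_dir=$(mktemp -d "${TMPDIR:-/tmp}/myjob.XXXXXX") || exit 1
    trap 'rm -rf "$work_dir"' EXIT

    cd "$work_dir" || exit 1
    # Placeholder for the real computation.
    printf 'intermediate data\n' > scratch.dat
    wc -c < scratch.dat | tr -d ' ' > result.txt

    # Keep only what matters; everything else is removed by the trap.
    cp result.txt "$results_dir/"
)

run_in_tmp
cat "$results_dir/result.txt"
```

Cleaning up in a trap means the temporary files are removed even if the computation fails partway through.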

Long-Term Storage

Long-term or permanent storage is available to RCAC users on the DXUL/UniTree archival storage system, commonly referred to as "Fortress". DXUL (DiskXtender for Unix and Linux) and UniTree make up a software package that manages a hierarchical storage system. Program files, data files, and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity. However, since two copies are retained of every file, the usable capacity is only 600 TB. Recently used files smaller than 0.5 MB have their primary copy stored on low-cost disks, but the second copy is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the large latency to access a larger file (usually involving a copy from a tape cartridge) makes Fortress unsuitable for use as active storage. In addition to performing poorly, the following two uses can cause severe problems with the system itself:

  • DO NOT store any actively used files on Fortress.
  • DO NOT store large collections of small files on Fortress.

Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first.
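
Combining small files before archiving might look like the following. This is a minimal sketch, assuming a tar with gzip support (the z option); the directory and file names are placeholders created only for the demonstration.

```shell
#!/bin/bash
# Sketch: bundle a directory of small files into one compressed archive
# before sending it to archival storage. All names are placeholders.
stage_dir=$(mktemp -d)
mkdir "$stage_dir/run42_output"
for i in 1 2 3 4 5; do
    printf 'sample %s\n' "$i" > "$stage_dir/run42_output/part_$i.txt"
done

# -C keeps the paths inside the archive relative to the staging directory.
tar czf "$stage_dir/run42_output.tar.gz" -C "$stage_dir" run42_output

# Verify the archive before deleting anything: list it and count the files.
archived_files=$(tar tzf "$stage_dir/run42_output.tar.gz" | grep -c 'part_.*\.txt$')
echo "$archived_files files archived"
```

Listing the archive before removing the originals is a cheap safeguard against a truncated or empty tar file.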

Note that Fortress cannot be accessed directly from Black. Due to the distance, transferring extremely large amounts of data to and from Fortress may not be feasible. RCAC has more information about Fortress.

Environment Variables

On Black you may use SoftEnv, an environment management system, to customize your environment (specify the software packages you plan to use) using symbolic keywords. For more information about using SoftEnv on Black, refer to Indiana University's "Big Red" SoftEnv documentation.

Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:

  • $USER: your username
  • $HOME: path to your home directory
  • $PWD: path to your current directory
  • $PATH: all directories searched for commands/applications
  • $HOSTNAME: name of the machine you are on
  • $SHELL: your current shell (bash, tcsh, csh, ksh)
  • $SSH_CLIENT: your local client's IP address
  • $TERM: type of terminal or terminal emulator being used

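To use the value of an environment variable, prefix its name with a dollar sign ($). By convention, the names are all uppercase. Variables may be used on the command line or in any scripts in place of and in combination with hard-coded values. When a variable name is followed directly by other characters, enclose the name in braces so the shell knows where it ends: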

$ ls $HOME
...

$ ls $HOME/myproject
...

$ ls $HOME/myproject/${HOSTNAME}_data
...
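
When a variable is followed directly by letters, digits, or an underscore, the shell needs braces to tell where the name ends. This is a small sketch of the difference, assuming bash; DEMO_HOST is a made-up variable standing in for $HOSTNAME.

```shell
#!/bin/bash
# DEMO_HOST is a made-up variable standing in for $HOSTNAME.
DEMO_HOST=node1
unset DEMO_HOST_data

# Without braces the shell looks up a variable named "DEMO_HOST_data",
# which is unset, so the result is an empty string.
without_braces="$DEMO_HOST_data"

# Braces mark where the variable name ends.
with_braces="${DEMO_HOST}_data"

echo "without braces: '$without_braces'"
echo "with braces:    '$with_braces'"
```

A slash or a dot already ends a variable name, which is why $HOME/myproject works without braces.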

You may find the value of any environment variable by using the "echo" command:

$ echo $HOME
/home/somepath/myusername

$ echo $SHELL
/usr/local/bin/tcsh

You may list the values of all environment variables using the "env" command:

$ env
USER=myusername
HOME=/home/ba01/u101/myusername
SHELL=/usr/local/bin/tcsh
...

You may create or overwrite an environment variable using either "export" or "setenv", depending on your shell:

  (for bash and sh)
$ export VARIABLE=value

  (for tcsh and csh)
$ setenv VARIABLE value

Storage Quotas / Limits

Your disk usage is limited on RCAC systems. However, each filesystem (scratch, home directory, etc.) may have a different limit. If you exceed the soft limit or quota, you will see warnings whenever writing to the disk that you are over quota, but the write will still succeed. If you exceed the hard limit or limit, your write will fail until you either remove other files or your quota is increased. Generally, RCAC systems do not impose a soft limit—only a hard limit.

Checking Quota Usage

You may find out what your current quota is by using the "quota" command:

$ quota
Disk quotas for user myusername (uid 12345): 
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
     /N/fs8       36347   50000   55000            1178       0       0

The columns are as follows:

  1. Filesystem: This indicates the line is for the user's files on /N/fs8, which doing "echo $HOME" confirms is the user's home directory filesystem.
  2. Blocks: This shows how many 1 KB blocks the user's files take up. In this case, 36347 KB / 1024 = 35.5 MB.
  3. Quota: This shows the user's soft limit in 1 KB blocks. In this case, 50000 KB / 1024 = 48.8 MB. A value of 0 would mean that no soft limit is being imposed.
  4. Limit: This shows the user's hard limit in 1 KB blocks. In this case, 55000 KB / 1024 = 53.7 MB.
  5. Grace: This would show the grace period (in days) for any soft limit (on Black, 5 days). It is blank here because the user is under the soft limit.
  6. Files: This shows how many file pointers (inodes) the user is currently using (1178 here). This is based on the number of files and directories, not their size.
  7. Quota: This shows that soft limits are not being imposed for file pointers (0).
  8. Limit: This shows the user's file pointer hard limit (0 here, meaning none is being imposed). Where such a limit is set, it is possible, though unlikely, to hit it and not the size limit if you create a large number of very small files.
  9. Grace: This would show the grace period (in days) for any file pointer soft limit (none in this case).
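
The block counts can be converted to more readable units with a little arithmetic. This is a minimal sketch, assuming awk is available; it works on the sample line from the "quota" output shown above rather than on a live system, and the variable names are illustrative.

```shell
#!/bin/bash
# Sketch: convert the 1 KB block counts from a "quota" line into MB and a
# percentage of the hard limit. On a live system you could pipe the output
# of "quota" in instead of using this sample line.
sample_line='     /N/fs8       36347   50000   55000            1178       0       0'

quota_summary=$(printf '%s\n' "$sample_line" | awk '{
    used_mb = $2 / 1024
    soft_mb = $3 / 1024
    hard_mb = $4 / 1024
    printf "%s: %.1f MB used, soft %.1f MB, hard %.1f MB (%.0f%% of hard limit)",
           $1, used_mb, soft_mb, hard_mb, 100 * $2 / $4
}')
echo "$quota_summary"
```

For the sample line this prints "/N/fs8: 35.5 MB used, soft 48.8 MB, hard 53.7 MB (66% of hard limit)".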

You may also see the disk usage of any given directory by using "du":

$ du -hs
1.1G    .

$ du -hs $HOME
35M     /N/fs8/myusername

This can be very helpful in figuring out where your largest files or directories are, so that you may clean out unneeded large files and avoid hitting your quota.
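
To find where the space is going, the per-directory totals from "du" can be sorted by size. This is a minimal sketch, assuming du, sort, and head behave as on a typical Linux system; it builds a throwaway directory tree with placeholder names so that it can be run anywhere.

```shell
#!/bin/bash
# Sketch: rank the subdirectories of a directory by disk usage, largest first.
# A throwaway tree is built here; in practice set target_dir to "$HOME" or a
# project directory.
target_dir=$(mktemp -d)
mkdir "$target_dir/big_results" "$target_dir/small_notes"
head -c 1048576 /dev/urandom > "$target_dir/big_results/output.dat"
printf 'todo\n' > "$target_dir/small_notes/notes.txt"

# -s: one total per argument, -k: report in 1 KB blocks so sort -n works.
usage_report=$(du -sk "$target_dir"/* | sort -rn | head -n 5)
echo "$usage_report"

# The first line of the report names the largest entry.
largest=$(printf '%s\n' "$usage_report" | head -n 1 | awk '{ print $2 }')
```

Using -k rather than -h keeps the sizes as plain numbers, so a numeric sort orders them correctly.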

Requesting Quota Increase

If you find you need additional disk space on Black, please first consider archiving and compressing old files. If this does not resolve the issue, you may contact the Indiana University High Performance Systems Group to request additional space.

Archive and Compression

There are several options for archiving and compressing groups of files or directories on RCAC systems. All of the following tools are provided:

  • zip   (more information)
    Simple compression and file packaging utility.
    Examples:
      (compress file somefile.c)
    $ zip somefile.zip somefile.c
    
      (extract contents of somefile.zip)
    $ unzip somefile.zip
    
      (compress all files in a directory into one archive file)
    $ zip -r somefile.zip somedirectory/
    
      (compress all ".c" files in current directory and its subdirectories into one archive file)
    $ zip -r somefile.zip . -i \*.c
    
  • tar   (more information)
    Saves many files together into a single archive file, and restores individual files from the archive. Includes automatic archive compression/decompression options and special features that allow tar to be used for incremental and full backups.
    Examples:
      (archive file somefile.c)
    $ tar cvf somefile.tar somefile.c
    
      (archive and compress file somefile.c)
    $ tar czvf somefile.tar.gz somefile.c
    
      (list contents of archive somefile.tar)
    $ tar tvf somefile.tar
    
      (extract contents of somefile.tar)
    $ tar xvf somefile.tar
    
      (extract contents of gzipped archive somefile.tar.gz)
    $ tar xzvf somefile.tar.gz
    
      (archive and compress all files in a directory into one archive file)
    $ tar czvf somefile.tar.gz somedirectory/
    
      (archive and compress all ".c" files in current directory into one archive file)
    $ tar czvf somefile.tar.gz *.c 
    
  • gzip   (more information)
    Compression utility designed as a replacement for compress, with much better compression and no patented algorithms. The standard compression system for all GNU software.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ gzip somefile
    
      (uncompress file somefile.gz - also removes compressed file)
    $ gunzip somefile.gz
    
  • bzip2   (more information)
    Strong, lossless data compressor based on the Burrows-Wheeler transform. Also available as a library.
    Examples:
      (compress file somefile - also removes uncompressed file)
    $ bzip2 somefile
    
      (uncompress file somefile.bz2 - also removes compressed file)
    $ bunzip2 somefile.bz2
    
  • compress   (more information)
    Adaptive Lempel-Ziv compressor. Not often used today.

Windows users can work with these same formats using some of the following software:

  • 7-Zip
    Free Windows software package that can handle all the above formats.
  • WinZip
    Commercial Windows software package that can handle all the above formats.
  • WinRAR
    Commercial Windows software package that can handle all the above formats.

File Transfer

There are a variety of ways to transfer data to and from RCAC systems. Which you should use depends on several factors, including the ease of use for you personally, connection speed and bandwidth, and the size and number of files to be transferred. For more details on file transfer methods and applications, refer to the Black Complete User Guide.

Provided Applications

Software packages on Black are loaded using SoftEnv. Indiana University hosts a list of available applications and how to load them.

Environment Management with the Module Command

The "module" command used on most RCAC systems is not available on Black. SoftEnv is provided instead.

Environment Management with SoftEnv

On Black, software packages are loaded using SoftEnv. Indiana University has more detailed instructions on how to use SoftEnv.

Provided Compilers

Compilers are available on Black for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution. More detailed documentation on each compiler set available on Black follows.

Compilation of serial programs for Black may be done on Gray, should you only have Condor access to Black.

IBM Compiler Set

To use the IBM compiler set on Black, you do not need to load anything first. The compiler programs will generally already be in your path. However, to compile MPI programs, you will need to load MPI support via SoftEnv. Here are some examples:

Language Serial Program OpenMP Program
Fortran77
$ xlf_r myprogram.f -o myprogram
$ xlf_r -qsmp=omp myprogram.f -o myprogram
Fortran90
$ xlf90_r myprogram.f -o myprogram
$ xlf90_r -qsmp=omp myprogram.f -o myprogram
Fortran95
$ xlf95_r myprogram.f -o myprogram
$ xlf95_r -qsmp=omp myprogram.f -o myprogram
C
$ xlc_r myprogram.c -o myprogram
$ xlc_r -qsmp=omp myprogram.c -o myprogram
C++
$ xlC_r myprogram.cpp -o myprogram
$ xlC_r -qsmp=omp myprogram.cpp -o myprogram
MPI Program (32-bit) MPI Program (64-bit)
Fortran77
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-32
$ mpif77 myprogram.f -o myprogram
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-64
$ mpif77 -q64 myprogram.f -o myprogram
Fortran90
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-32
$ mpif90 myprogram.f -o myprogram
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-64
$ mpif90 -q64 myprogram.f -o myprogram
Fortran95 (not available) (not available)
C
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-32
$ mpicc myprogram.c -o myprogram
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-64
$ mpicc -q64 myprogram.c -o myprogram
C++
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-32
$ mpiCC myprogram.cpp -o myprogram
$ soft add +teragrid-dev
$ soft add +mpich-mx-ibm-64
$ mpiCC -q64 myprogram.cpp -o myprogram

More information on compiler options can be found in the official man pages, which can be accessed using the "man" command.

GNU Compiler Set

To use the GNU compiler set on Black, you do not need to load anything first. The compiler programs will already be in your path. Here are some examples:

  Fortran77
      (serial)
    $ gfortran myprogram.f -o myprogram
      (MPI: not available)
      (OpenMP: not available)

  Fortran90
      (serial)
    $ gfortran myprogram.f90 -o myprogram
      (MPI: not available)
      (OpenMP: not available)

  Fortran95
      (serial)
    $ gfortran myprogram.f95 -o myprogram
      (MPI: not available)
      (OpenMP: not available)

  C
      (serial)
    $ gcc myprogram.c -o myprogram
      (MPI: not available)
      (OpenMP: not available)

  C++
      (serial)
    $ g++ myprogram.cpp -o myprogram
      (MPI: not available)
      (OpenMP: not available)

More information on compiler options can be found in the official man pages, which can be accessed using the "man" command.

Running Jobs on Black

There are two methods for submitting jobs to Black: to the job queue through LoadLeveler, and with Condor from Gray. Here we will look at submitting jobs through LoadLeveler. These jobs may be serial, message-passing, or shared-memory in nature. Very small programs can simply be run directly from the command line, but anything larger should be submitted to the job queue.

Running Jobs via LoadLeveler

To submit jobs to the job queue, use LoadLeveler. If you are used to PBS, see the "Migrating to LoadLeveler from PBS" section below, which maps common PBS commands to their LoadLeveler equivalents. Job scheduling is handled by Moab; the most common Moab commands are listed in the same section.

Black LoadLeveler Tips

Hold a job temporarily:

llhold [job_id]

Resume job on hold:

llhold -r [job_id]

Potential problem for interactive jobs: if your login shell is csh or tcsh, the X11 server managing your display may not receive the correct X11 authority information (protocol and key-data) from xterm in this context. In that case you will have to open the server to the world by issuing the command:

xhost +

References

Migrating to LoadLeveler from PBS

If you have been using the PBS system to submit jobs, then this section should help you get started with LoadLeveler.

Common commands

                         PBS command          LL command
-----------------------  -------------------  --------------------
Job submission           qsub <jobscript>     llsubmit <jobscript>
Job cancel               qdel <job id>        llcancel <job id>
Job status               qstat -u <username>  llq -u <username>
Extended job status      qstat -f <job id>    llq -l <job id>
Hold job (temporarily)   qhold <job id>       llhold <job id>
Resume job on hold       qrls <job id>        llhold -r <job id>
List usable queues       qstat -Q             llclass
Extended list of queues  qstat -Qf            llclass -l

Environment variables

                      PBS variable    LL variable
--------------------  --------------  ---------------------
Job ID                $PBS_JOBID      $LOADL_STEP_ID
Submission directory  $PBS_O_WORKDIR  $LOADL_STEP_INITDIR
Node/cpu list         $PBS_NODEFILE   $LOADL_PROCESSOR_LIST

Resource specifications

                      PBS directive                 LL directive
--------------------  ----------------------------  ------------------------------
Nodes/'chunks'        #PBS -l select=<# nodes>      #@ node=<# nodes>
Processors            #PBS -l ncpus=<# cpus>        #@ tasks_per_node=<# tasks>
Wall clock limit      #PBS -l walltime=[hh:mm:ss]   #@ wall_clock_limit=[hh:mm:ss]
Standard output file  #PBS -o <output filename>     #@ output=<output filename>
Standard error file   #PBS -e <error filename>      #@ error=<error filename>
Queue                 #PBS -q <queue>               #@ class=<queue>
Transfer environment  #PBS -V                       #@ environment=COPY_ALL
Send email to         #PBS -M <email>               #@ notify_user=<email>
Job name              #PBS -N <name>                #@ job_name=<name>
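
The directive mapping above can be applied mechanically to simple job scripts. This is a rough sketch, assuming a sed that supports -E and PBS scripts with one option per #PBS line; the function name pbs_to_ll, the queue name "workq", and the sample script are made up, the select/ncpus forms are not handled, and the result still needs a "# @ queue" line added by hand.

```shell
#!/bin/bash
# Sketch: translate simple one-option "#PBS" lines into LoadLeveler
# "# @ keyword = value" lines, following the table above.
# Lines that do not match any rule pass through unchanged.
pbs_to_ll() {
    sed -E \
        -e 's/^#PBS[[:space:]]+-N[[:space:]]+(.*)$/# @ job_name = \1/' \
        -e 's/^#PBS[[:space:]]+-q[[:space:]]+(.*)$/# @ class = \1/' \
        -e 's/^#PBS[[:space:]]+-o[[:space:]]+(.*)$/# @ output = \1/' \
        -e 's/^#PBS[[:space:]]+-e[[:space:]]+(.*)$/# @ error = \1/' \
        -e 's/^#PBS[[:space:]]+-M[[:space:]]+(.*)$/# @ notify_user = \1/' \
        -e 's/^#PBS[[:space:]]+-V[[:space:]]*$/# @ environment = COPY_ALL/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+walltime=(.*)$/# @ wall_clock_limit = \1/'
}

# Hypothetical PBS script used as input.
converted=$(pbs_to_ll <<'EOF'
#PBS -N myjob
#PBS -q workq
#PBS -l walltime=00:10:00
#PBS -V
./myprogram
EOF
)
echo "$converted"
```

Queue names differ between systems, so the class value in particular should be replaced with one of the Black queues before submitting.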

Common Moab scheduler commands

Show running/queued jobs    showq | less
Check job status            checkjob <job id> OR checkjob -v <job id>
Show assumed start time     showstart <job id>
Show fairshare information  diagnose -f | less
Check node status           checknode <nodename>
Show reservations           showres

Much of the information in this section comes from IU's page: http://rc.uits.iu.edu/kb/index.php?kbID=avgl.

Black LoadLeveler Queues

The table below shows the queues which are available to Purdue users of Big Red via Black.

Name of queue  Default nodes  Max nodes  Wall clock limit (default/max)  Job CPU limit (default/max)  Maximum slots  Comments
-------------  -------------  ---------  ------------------------------  ---------------------------  -------------  ---------------------
PU_LOW         1              16         02:00:00/07:00:00               07:00:00/112:00:00           256            Low queue for Purdue
PU_MED         1              64         02:00:00/07:00:00               07:00:00/448:00:00           256            Med queue for Purdue
PU_HIGH        1              128        02:00:00/07:00:00               07:00:00/896:00:00           512            High queue for Purdue
PU_WIDE        1              256        02:00:00/07:00:00               07:00:00/1792:00:00          1024           Wide queue for Purdue
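
A script that submits jobs of varying size could choose a queue from the "Max nodes" column. This is a minimal sketch, assuming bash; the function name pick_queue is hypothetical, and preferring the smallest queue that fits is an assumption made for this example, not a site policy.

```shell
#!/bin/bash
# Sketch: pick the first Purdue queue whose node limit fits the request.
# The limits are the "Max nodes" values from the table above.
pick_queue() {
    local nodes="$1"
    if   [ "$nodes" -lt 1 ];   then echo "invalid"; return 1
    elif [ "$nodes" -le 16 ];  then echo "PU_LOW"
    elif [ "$nodes" -le 64 ];  then echo "PU_MED"
    elif [ "$nodes" -le 128 ]; then echo "PU_HIGH"
    elif [ "$nodes" -le 256 ]; then echo "PU_WIDE"
    else echo "none"; return 1
    fi
}

pick_queue 4
pick_queue 100
```

The two calls print PU_LOW and PU_HIGH respectively.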

To see all queues, type llclass, or llclass -l for more information. Here is an example:

user123@BigRed:~> llclass
Name                 MaxJobCPU     MaxProcCPU  Free   Max Description          
                    d+hh:mm:ss     d+hh:mm:ss Slots Slots                      
--------------- -------------- -------------- ----- ----- ---------------------
DEBUG                 04:00:00       00:15:00    16    16 Fast Debug Queue     
FAST                  04:00:00       00:15:00    16    16 Fast Debug Queue     
PU_LOW            112+00:00:00     7+00:00:00   228   256 Low queue for Purdue 
PU_MED            448+00:00:00     7+00:00:00   256   256 Med queue for Purdue 
PU_HIGH           896+00:00:00     7+00:00:00   256   512 High queue for Purdue
PU_WIDE          1792+00:00:00     7+00:00:00   740  1024 Wide queue for Purdue
LONG             1792+00:00:00    14+00:00:00   167  1456 Intermediate Queue for up to 32 nodes
MED              1792+00:00:00    14+00:00:00   167  1456 Intermediate Queue for up to 32 nodes
NORMAL           2048+00:00:00     2+00:00:00   288  1564 Big Queue for up to 256 nodes
BIG              2048+00:00:00     2+00:00:00   288  1564 Big Queue for up to 256 nodes
--------------------------------------------------------------------------------
"Maximum Slots" value of the class "FAST" is constrained by the MAX_STARTERS limit(s).
"Free Slots" values of the classes "FAST", "PU_LOW", "PU_WIDE", "LONG", "MED",
"NORMAL", "BIG" are constrained by the MAX_STARTERS limit(s).
user123@BigRed:~> 

Black LoadLeveler Submission Script

Example LoadLeveler Submission Script

# Specify which shell to use, will use owner's shell if none is given. 
# @ shell = bash

# Specify job type, default is serial (string). For an OpenMP or MPI job, use parallel.
# @ job_type = parallel

# Specify environment, COPY_ALL means all environment variables from your shell are copied.
# @ environment = COPY_ALL

# You can specify whether or not to have Loadleveler send you mail. (always|error|start|never|complete) 
# @ notification = complete 

# Specify the name of the queue (class).
# @ class = PU_LOW

# If you are charging to a project, specify the account name with this.
# @ account_no = abc

# Number of nodes to request (only for parallel programs).
# @ node = 4 

# Number of tasks per node for MPI programs. For OpenMP or serial programs, use 1 or
# omit this keyword for the default of 1 task.
# @ tasks_per_node = 4

# Specify memory requirements (in MB). 
# @ requirements=(Memory >= 1024)

# Sets the limit for the time a job can run. Default is 30 min (00:30:00). 
# @ wall_clock_limit = 00:10:00

# Change to directory that job was submitted from, same as cd $PBS_O_WORKDIR - alternatively, 
# just specify the full path for the program to run.
cd $LOADL_STEP_INITDIR 

# For an OpenMP program, remember to set OMP_NUM_THREADS if you haven't exported your
# environment and set it there. The example below is for bash and asks for 4 threads.
export OMP_NUM_THREADS=4

# The program to run. Give the full path unless you changed to the directory the job was
# submitted from (as above) and were in that directory when submitting.
# No executable keyword is given, so the commands in this file are run as the job.
./omp_hello

# Specify the name of the output file.
# @ output = out.$(jobid)

# Specify the name of the file to write any error to.
# @ error = $(jobid).$(stepid).err

# Tell the system to put a copy of the job in the queue.
# @ queue
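
The script above can also be generated from a few parameters. The following is a minimal sketch in bash, not a site-provided tool: make_jobfile is a hypothetical helper name, and the choice of one node with one task for an OpenMP run is an assumption made for illustration. It prints a job command file to standard output using only keywords that appear in the example above.

```shell
#!/bin/bash
# Sketch: print a LoadLeveler job command file for an OpenMP program.
# make_jobfile is a hypothetical helper, not part of LoadLeveler.
# Arguments: class name, number of OpenMP threads, program to run.
make_jobfile() {
    local class="$1" threads="$2" program="$3"
    # \$ keeps LoadLeveler variables such as $(jobid) literal in the output.
    cat <<EOF
# @ shell = bash
# @ job_type = parallel
# @ environment = COPY_ALL
# @ class = ${class}
# @ node = 1
# @ tasks_per_node = 1
# @ wall_clock_limit = 00:10:00
# @ output = out.\$(jobid)
# @ error = \$(jobid).\$(stepid).err
# @ queue
cd \$LOADL_STEP_INITDIR
export OMP_NUM_THREADS=${threads}
${program}
EOF
}

# Print a job file for ./omp_hello in class PU_LOW with 4 threads.
make_jobfile PU_LOW 4 ./omp_hello
```

Redirecting the output to a file (for example, make_jobfile PU_LOW 4 ./omp_hello > jobscript) gives a file that can be passed to llsubmit.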

Job Command File Keywords Reference

These are the keywords you can use in a LoadLeveler job command file.

account_no
arguments
checkpoint
class
comment
core_limit
cpu_limit
data_limit
dependency
environment
error
executable
file_limit
group
hold
image_size
initialdir
input
job_cpu_limit
job_name
job_type
max_processors
min_processors
notification
notify_user
output
parallel_path
preferences
queue
restart
requirements
rss_limit
shell
stack_limit
startdate
step_name
user_priority
wall_clock_limit

LoadLeveler Job Submission

The command to submit the job submission file is the following:

user123@BigRed:~> llsubmit jobscript
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.826218" has been submitted.
user123@BigRed:~> 
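
When submitting from a script, it is often convenient to capture the job name that llsubmit reports. The sketch below assumes the output format shown in the session above; extract_job_id is a hypothetical helper name. It runs against a copy of that sample output, so it does not need LoadLeveler to be installed.

```shell
#!/bin/bash
# extract_job_id is a hypothetical helper: it reads llsubmit output on stdin
# and prints the quoted job name from the "has been submitted" line.
extract_job_id() {
    sed -n 's/^llsubmit: The job "\([^"]*\)" has been submitted\.$/\1/p'
}

# Sample output copied from the session above.
sample='llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.826218" has been submitted.'

printf '%s\n' "$sample" | extract_job_id   # prints s10c2b5.dim.826218
```

On the cluster, the same function could be used as job=$(llsubmit jobscript | extract_job_id).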

LoadLeveler Job Status

Checking job status (for user):

llq -u [username]

Example

user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.826218.0         user123    12/3  14:06 I  50  PU_LOW                  

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> 

Extended job status:

llq -l [job_id]
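
The ST column of llq is what a script usually needs. The following sketch assumes the column layout shown in the example above, where ST is the fifth whitespace-separated field; job_state and sample_llq are hypothetical names, and the sample text is copied from that example so the sketch runs without LoadLeveler. Note that llq lists the job step as s10c2b5.826218.0, while llsubmit reported the job as s10c2b5.dim.826218.

```shell
#!/bin/bash
# job_state is a hypothetical helper: given a job step ID, it reads llq
# output on stdin and prints the ST column for that step.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Sample llq output copied from the example above.
sample_llq() {
    cat <<'EOF'
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.826218.0         user123    12/3  14:06 I  50  PU_LOW                  

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
EOF
}

sample_llq | job_state s10c2b5.826218.0   # prints I
```

On the cluster, the input would come from llq -u [username] instead of the sample function.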

LoadLeveler Job Cancellation

Deleting the job:

llcancel [job_id]

LoadLeveler Interactive Jobs

To run an interactive session under LoadLeveler, create a LoadLeveler job command file for interactive use; all interactive parallel jobs must use one. The file contains LoadLeveler keyword statements that specify the requirements of the interactive job, and it looks very similar to the usual batch job submission file.

Batch jobs do not need to specify a job class, but interactive jobs do. This is done with

#@ class = [class_name]

You can see the available classes with llclass:

user123@BigRed:~> llclass
Name                 MaxJobCPU     MaxProcCPU  Free   Max Description          
                    d+hh:mm:ss     d+hh:mm:ss Slots Slots                      
--------------- -------------- -------------- ----- ----- ---------------------
DEBUG                 04:00:00       00:15:00    16    16 Fast Debug Queue     
ADMIN               1+08:00:00       02:00:00     0     0 Admin Queue          
SERIAL              8+00:00:00     2+00:00:00   174  3036 Serial backfill queue
PU_LOW            112+00:00:00     7+00:00:00   173   256 Low queue for Purdue 
PU_MED            448+00:00:00     7+00:00:00   160   256 Med queue for Purdue 
PU_HIGH           896+00:00:00     7+00:00:00    96   512 High queue for Purdue
PU_WIDE          1792+00:00:00     7+00:00:00   429  1024 Wide queue for Purdue
IEDC             1792+00:00:00    14+00:00:00   551  2048 IEDC Queue           
LONG             1792+00:00:00    14+00:00:00   111  1216 Intermediate Queue for up to 32 nodes
SPRUCE           1792+00:00:00    14+00:00:00    16    16 SPRUCE queue         
NORMAL           2048+00:00:00     2+00:00:00    63  1820 Big Queue for up to 256 nodes
--------------------------------------------------------------------------------
"Maximum Slots" value of the class "ADMIN" is constrained by the MAX_STARTERS limit(s).
"Free Slots" values of the classes "ADMIN", "SERIAL", "PU_LOW", "PU_MED", "PU_HIGH", 
"PU_WIDE", "IEDC", "LONG", "NORMAL" are constrained by the MAX_STARTERS limit(s).
user123@BigRed:~> 
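
To pick a class with enough free slots, the llclass listing can be filtered. This is a rough sketch that assumes the column layout shown above, with Free Slots as the fourth field and Max Slots as the fifth; classes_with_free_slots and sample_llclass are hypothetical names, and the sample is a subset of the listing above.

```shell
#!/bin/bash
# classes_with_free_slots is a hypothetical helper: it reads llclass output on
# stdin and prints the names of classes with at least the given free slots.
# Header and footer lines are skipped because their 4th and 5th fields are
# not plain integers.
classes_with_free_slots() {
    awk -v min="$1" '$4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $4 + 0 >= min + 0 { print $1 }'
}

# A subset of the llclass listing above.
sample_llclass() {
    cat <<'EOF'
Name                 MaxJobCPU     MaxProcCPU  Free   Max Description          
                    d+hh:mm:ss     d+hh:mm:ss Slots Slots                      
--------------- -------------- -------------- ----- ----- ---------------------
DEBUG                 04:00:00       00:15:00    16    16 Fast Debug Queue     
PU_LOW            112+00:00:00     7+00:00:00   173   256 Low queue for Purdue 
PU_HIGH           896+00:00:00     7+00:00:00    96   512 High queue for Purdue
PU_WIDE          1792+00:00:00     7+00:00:00   429  1024 Wide queue for Purdue
--------------------------------------------------------------------------------
"Maximum Slots" value of the class "ADMIN" is constrained by the MAX_STARTERS limit(s).
EOF
}

# Classes in the sample with at least 100 free slots: PU_LOW and PU_WIDE.
sample_llclass | classes_with_free_slots 100
```

On the cluster, llclass | classes_with_free_slots 100 would apply the same filter to the live listing.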

For interactive jobs, #@ node_usage = shared should be specified.

Here is an example of a job submission file that runs an interactive session (serial job). Because it opens an xterm, make sure that your display is set properly so that X Window programs are allowed access to it.

#@ output = $(job_name).out
#
#@ error = $(job_name).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ node_usage = shared
#
#@ environment = COPY_ALL
#
#@ executable = /usr/bin/xterm
#
#@ arguments = -ls -sb -sl 300
#
#@ queue

If you want to run an MPI/parallel job, set job_type = parallel and cpus = [wanted number of cpus]. You can set the wall-clock time limit with wall_clock_limit = hh:mm:ss.

A script should be submitted with the following command:

llsubmit [job_script]

You can run emacs with scripts like the above, using #@ executable = /usr/bin/emacs.

Getting the display to work:

As mentioned, you must have a workstation running an X11 server with X11 authority for this to work. You also need to define the display itself. The value of DISPLAY is passed on to xterm because the job submission file contains:

# @ environment = COPY_ALL

To find the values to set for your display to work, issue the command xauth list in your local terminal:

Example

user123@BigRed:~> xauth list
s10c2b11/unix:10  MIT-MAGIC-COOKIE-1  4f4bcf417d9d84592458f02e88eae05b
s10c2b12/unix:10  MIT-MAGIC-COOKIE-1  0a2361ad5d2f5f41e15f1d600cf5d3d3
user123@BigRed:~> 

Copy the whole line for your display from the listing (the second line in this example) and switch to an xterm on Black/BigRed. There, type:

xauth add s10c2b12/unix:10  MIT-MAGIC-COOKIE-1  0a2361ad5d2f5f41e15f1d600cf5d3d3

Change to your own values, of course!

Then define the display itself:

export DISPLAY=s10c2b12/unix:10

and you can submit the job with

llsubmit [jobscript]

To test that it actually works, you can try opening xclock:

xclock -display s10c2b12/unix:10.0 &
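
The two display commands can be derived mechanically from one line of xauth list output. The sketch below is only an illustration of the steps above; display_commands is a hypothetical helper name, and the cookie value is the sample one from the listing, not a real credential.

```shell
#!/bin/bash
# display_commands is a hypothetical helper: it takes one line of
# `xauth list` output (display name, protocol, cookie) and prints the
# `xauth add` and `export DISPLAY` commands described above.
display_commands() {
    local name proto cookie
    # read splits on runs of whitespace, so the double spaces are harmless.
    read -r name proto cookie <<< "$1"
    printf 'xauth add %s %s %s\n' "$name" "$proto" "$cookie"
    printf 'export DISPLAY=%s\n' "$name"
}

# Sample line copied from the xauth listing above.
display_commands 's10c2b12/unix:10  MIT-MAGIC-COOKIE-1  0a2361ad5d2f5f41e15f1d600cf5d3d3'
```

The printed commands are meant to be reviewed and then run in the xterm on Black/BigRed before submitting the job.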

More information can be found here: http://beige.ucs.indiana.edu/gustav/ll-hints.html#interactive and here: http://beige.ucs.indiana.edu/B673/node93.html

LoadLeveler Examples

In these examples I will use this simple script

#@ output = $(job_id).out 
#@ error = $(job_id).err 
#@ job_type = serial
#@ class = PU_LOW 
#@ notification = never
#@ executable = /N/u/user123/BigRed/hello
#@ environment = COPY_ALL
#@ queue

To just submit the above job submission file (called 'hello_script') issue the following command

user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581935" has been submitted.
user123@BigRed:~> 

Note that a job submission file may be called a job script at other sites.

Then, after a little while, you will get one or two new files in your directory

user123@BigRed:~> ls
hello        hello_script               
hello.c      s10c2b5.dim.581935.out
user123@BigRed:~> 

Note that the corresponding .err file will only be created if there actually were errors from the run.

You can now look at the results

user123@BigRed:~> less s10c2b5.dim.581935.out 
Hello World!
user123@BigRed:~> 

Obtain information about jobs in the queue

user123@BigRed:~> llq 
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.560864.0         user456     7/29 16:42 R  50  PU_WIDE      s19c2b6    
s10c2b5.560866.0         user456     7/29 16:42 R  50  PU_WIDE      s19c3b13   
s10c2b5.560868.0         user456     7/29 16:43 R  50  PU_WIDE      s19c3b8    
s10c2b5.560871.0         user456     7/29 16:43 R  50  PU_WIDE      s19c4b2    
s10c2b5.560872.0         user456     7/29 16:43 R  50  PU_WIDE      s20c1b1    
...
s10c2b5.567034.0         user789     7/31 11:17 R  50  PU_LOW       s16c3b7    
s10c2b5.567035.0         user789     7/31 11:17 R  50  PU_LOW       s16c3b8    
s10c2b5.567036.0         user789     7/31 11:17 R  50  PU_LOW       s16c3b9    

1339 job step(s) in queue, 952 waiting, 0 pending, 386 running, 1 held, 0 preempted

llq -l will display a much longer and more detailed list.

Hold a job/release a job

user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581944" has been submitted.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123     8/11 15:09 I  50  PU_LOW                  

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> llhold s10c2b5.581944.0
llhold: Hold command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123     8/11 15:09 H  50  PU_LOW                  

1 job step(s) in query, 0 waiting, 0 pending, 0 running, 1 held, 0 preempted
user123@BigRed:~> llhold -r s10c2b5.581944.0
llhold: Hold command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123     8/11 15:09 I  50  PU_LOW                  

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> 

Displaying a machine's status

user123@BigRed:~> llstatus 
Name                      Schedd InQ  Act Startd Run LdAvg Idle Arch      OpSys    
s10c1b1.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c1b2.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c1b3.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c1b4.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c2b5.dim               Avail  1332 379 None     0 0.07  9999 PPC64     Linux2   
s11c1b1.dim               Down      0   0 Busy     4 3.61  9999 PPC64     Linux2   
s11c1b10.dim              Down      0   0 Busy     4 4.74  9999 PPC64     Linux2   
s11c1b11.dim              Down      0   0 Busy     4 3.49  9999 PPC64     Linux2   
s11c1b12.dim              Down      0   0 Busy     4 4.40  9999 PPC64     Linux2   
...
s9c4b6.dim                Down      0   0 Busy     4 5.65  9999 PPC64     Linux2   
s9c4b7.dim                Down      0   0 Busy     4 4.62  9999 PPC64     Linux2   
s9c4b8.dim                Down      0   0 Busy     4 9.50  9999 PPC64     Linux2   
s9c4b9.dim                Down      0   0 Busy     4 4.67  9999 PPC64     Linux2   

PPC64/Linux2             1021 machines   1332  jobs   3237  running tasks
Total Machines           1021 machines   1332  jobs   3237  running tasks

The Central Manager is defined on s10c2b5.dim

The API scheduler is in use

The following machines are marked SUBMIT_ONLY
s10c2b1.dim
s10c2b2.dim
s10c2b3.dim
s10c2b4.dim
s10c2b6.dim

The following 4 machines are marked absent
s10c1b5.dim
s10c1b6.dim
s10c1b7.dim
s10c1b8.dim

To show only the machine and memory lines of the long listing:

llstatus -l |grep -E "Machine|Memory"
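
The short llstatus listing can be summarized in a similar way. The sketch below counts machines by their Startd state, assuming the ten-column layout shown in the listing above; startd_summary and sample_llstatus are hypothetical names, and the sample rows are copied from that listing.

```shell
#!/bin/bash
# startd_summary is a hypothetical helper: it reads llstatus output on stdin
# and prints how many machines are in each Startd state (5th column).
# Machine rows have 10 fields and a numeric InQ column, which excludes the
# header and the summary lines.
startd_summary() {
    awk 'NF == 10 && $3 ~ /^[0-9]+$/ { count[$5]++ }
         END { for (state in count) print state, count[state] }' | sort
}

# Sample rows copied from the llstatus listing above.
sample_llstatus() {
    cat <<'EOF'
Name                      Schedd InQ  Act Startd Run LdAvg Idle Arch      OpSys    
s10c1b1.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c1b2.dim               Down      0   0 Idle     0 0.00  9999 PPC64     Linux2   
s10c2b5.dim               Avail  1332 379 None     0 0.07  9999 PPC64     Linux2   
s11c1b1.dim               Down      0   0 Busy     4 3.61  9999 PPC64     Linux2   
s11c1b10.dim              Down      0   0 Busy     4 4.74  9999 PPC64     Linux2   

PPC64/Linux2             1021 machines   1332  jobs   3237  running tasks
EOF
}

sample_llstatus | startd_summary
```

For the sample rows this prints Busy 2, Idle 2, and None 1, one per line.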

Cancelling a job

user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581947" has been submitted.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581947.0         user123     8/11 15:13 I  50  PU_LOW                  

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> llcancel s10c2b5.581947.0
llcancel: Cancel command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On 
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581947.0         user123     8/11 15:13 CA 50  PU_LOW                  

0 job step(s) in query, 0 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> 

Multiple jobs

You can submit multiple jobs from a single script by using several 'queue' statements. Note that LoadLeveler statements in effect for the first job are generally in effect for all subsequent jobs in the same job command file, unless overridden later.

Submitting jobs this way is very useful if you want to run the same executable with different input/output files. Here is an example of that.

#@ executable  = ./myprogram
#
#@ input      = myprogram.input_1
#@ output     = myprogram.out_1
#@ error      = myprogram.err_1
#@ queue
#
#@ input      = myprogram.input_2
#@ output     = myprogram.out_2
#@ error      = myprogram.err_2
#@ queue

Same as above, but using predefined LoadLeveler macros to generate different output files. Five jobs will be queued, each of which reads a unique input file and creates unique output and error files.

#@ executable = myprogram
#
#@ input      = myprogram.in.$(Process)
#@ output     = myprogram.out.$(Cluster).$(Process)
#@ error      = myprogram.err.$(Cluster).$(Process)
#@ queue
#@ queue
#@ queue
#@ queue
#@ queue
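
Writing many queue statements by hand gets tedious, so a file like the one above can be generated. This is a small sketch in bash; make_multi_queue is a hypothetical helper name, and the file naming simply follows the example above.

```shell
#!/bin/bash
# make_multi_queue is a hypothetical helper: it prints a job command file that
# queues the same program a given number of times, using the $(Cluster) and
# $(Process) macros to keep the input, output, and error files distinct.
make_multi_queue() {
    local program="$1" count="$2" i
    printf '#@ executable = %s\n' "$program"
    printf '#\n'
    # Single-quoted formats keep $(Process) and $(Cluster) literal.
    printf '#@ input      = %s.in.$(Process)\n' "$program"
    printf '#@ output     = %s.out.$(Cluster).$(Process)\n' "$program"
    printf '#@ error      = %s.err.$(Cluster).$(Process)\n' "$program"
    for ((i = 0; i < count; i++)); do
        printf '#@ queue\n'
    done
}

# Reproduce the five-job example above.
make_multi_queue myprogram 5
```

The output can be redirected to a file and submitted with llsubmit.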

Term Commands as Executables

#!/bin/csh
#
# LoadLeveler commands
#@ initialdir = /N/u/user123/BigRed
#@ error      = run1.$(Cluster).err
#@ output     = run1.$(Cluster).out
#@ environment = MP_SHARED_MEMORY=yes
#@ queue

# Script commands
echo 'Copying input file to /scratch'
cp input.1  /scratch/input.1

echo 'Running the program'
myprogram

echo 'Copying output file back'
cp /scratch/output.1   output.1

rm /scratch/input.1
echo 'Cleanup done. Job completed.'

Serial LoadLeveler Example

#@ output = outfile.out
#@ error = errorfile.err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ executable = /N/u/user123/BigRed/hello
#
#@ queue

Important: You must assign values to "output" and "error" if your program writes to stdout and/or stderr. If not specified, these default to /dev/null.

If you want the output and error from each job to go to separate files, you can use the Executable, Cluster, and Process variables that LoadLeveler assigns.

Cluster: unique jobid

Process: assigned to each process queued within a script

This job submission file example executes a serial job twice, giving each a different output filename.

#@ output = $(Executable).$(Cluster).$(Process).out
#@ error = $(Executable).$(Cluster).$(Process).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ executable = /N/u/user123/BigRed/myprogram
#
#@ arguments = args1 args2 args3
#@ queue
#
#@ arguments = args4 args5 args6
#@ queue

If no executable is specified, LoadLeveler will assume that anything following the #@ queue statement consists of commands to be executed. Among other things, this can be used to run shell commands or to run several commands in sequence. Here is an example:

#@ output = myprogram.$(Cluster).out
#@ error = myprogram.$(Cluster).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ queue 
xlc /N/u/user123/BigRed/hello.c -o /N/u/user123/BigRed/hello 
/N/u/user123/BigRed/hello
rm /N/u/user123/BigRed/hello

MPI LoadLeveler Example

Job submission file and running MPI jobs

To submit a parallel/MPI job, you have to add #@ job_type = parallel to the job submission file. There are a number of other parameters which can/should be set:

  • #@ cpus = [# cpus] to give number of cpus
  • #@ node = [# nodes] to give the number of nodes you want.
  • #@ tasks_per_node = [# tasks] is used to give the number of tasks wanted per node
  • #@ total_tasks = [# tasks] indicates the total number of tasks requested for the run - can be used with either #@ node or #@ tasks_per_node, but not both
  • #@ environment statement is used for miscellaneous environment variables. More common ones are:
    • MP_SHARED_MEMORY=yes
    • MP_INFOLEVEL=#value
    • MP_SAVEHOSTFILE=myhosts.txt

To determine which nodes were used for your parallel execution, add

echo $LOADL_PROCESSOR_LIST > myhosts

to your job submission file, or ask to have mail sent to you; the mail will include the nodes used. Turn mail on by adding this to your job submission file:

#@ notification = complete
#@ notify_user = user@address.domain

Or set the MP_INFOLEVEL environment variable to a value above 1 and look in the file where standard error is written:

#@ error = myjob.err
#@ environment = MP_INFOLEVEL=2

Example MPI job submission file

#
#@ error           = myprogram.$(Cluster).$(process).err
#@ output          = myprogram.$(Cluster).$(process).out
#
# @ job_type = MPICH 
# @ account_no = NONE 
#
#@ notification    = complete
#@ notify_user     = user123@address.domain
#
#@ class = PU_MED
#
# @ environment=COPY_ALL; 
#
#@ node              = 4
#@ tasks_per_node    = 4
#
#@ wall_clock_limit= 15:00:00
#@ queue
#
## Users should always cd into their execution directory due to 
## a bug within LoadLeveler in dealing with the initialdir keyword. 
cd [execution directory]
## Use mpirun to execute your MPICH program in parallel; 
## $LOADL_TOTAL_TASKS and $LOADL_HOSTFILE are defined by 
## LoadLeveler for jobs of type MPICH. 
mpirun -np $LOADL_TOTAL_TASKS -machinefile $LOADL_HOSTFILE ./myprogram  

The above job submission file asks for 4 nodes, and 4 tasks on each node. The program is called 'myprogram'.
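
With 4 nodes and 4 tasks per node, the job requests 4 × 4 = 16 tasks in total, which is presumably the value mpirun receives through $LOADL_TOTAL_TASKS. The sketch below checks that arithmetic for a job submission file before it is submitted. total_tasks is a hypothetical helper name, and treating a missing keyword as 1 is an assumption of this sketch.

```shell
#!/bin/bash
# total_tasks is a hypothetical helper: it reads a job submission file on
# stdin and prints node * tasks_per_node.
# Assumption: a keyword that is not present counts as 1.
total_tasks() {
    awk -F'=' '
        BEGIN { nodes = 1; per_node = 1 }
        /^#[ \t]*@[ \t]*node[ \t]*=/           { nodes = $2 + 0 }
        /^#[ \t]*@[ \t]*tasks_per_node[ \t]*=/ { per_node = $2 + 0 }
        END { print nodes * per_node }
    '
}

# The two relevant lines from the example above.
printf '%s\n' '#@ node              = 4' '#@ tasks_per_node    = 4' | total_tasks   # prints 16
```

For a real file, total_tasks < parallel_jobscript.sh gives the same check.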

Submitting Parallel MPI Jobs

To submit a parallel job, you have three options:

  • Submit the parallel job using the parallel job submission file (simple MPI applications)
    • The parallel job submission file provides a convenient method for submitting parallel (multiple-processor) programs to the LoadLeveler batching and queuing system. Programs must consist of just one executable file, in contrast to some master/worker programs in which the master and workers are different executable files. To submit it, do
      paralleljob ./my_parallel_program
      

  • Run your job interactively with mpirun (for testing and debugging)
    • To access one of the interactive nodes, you must first log into Big Red and from there use ssh to connect to b509, b510, b511, or b512. Create a machinefile that lists one host per line, so that its contents look something like the following:
      cat mfile
      b509
      b510
      b509
      b510
      b509
      b510
      b509
      b510
      

      Then use mpirun to run your parallel job: mpirun -np [# procs] -machinefile mfile program-name
  • Use LoadLeveler to submit a parallel job, if you need the advanced mpirun options
    • To submit a parallel job that runs your MPI program, edit the sample LoadLeveler script shown above/create your own, with the correct number of nodes and tasks, the appropriate output/error files, etc. Then use this command to submit
      llsubmit parallel_jobscript.sh
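
For the interactive mpirun option above, the machinefile can be generated rather than typed. The following sketch assumes the machinefile holds one host name per line and cycles through the given hosts; make_machinefile is a hypothetical helper name, and b509 and b510 are the interactive nodes named in the list above.

```shell
#!/bin/bash
# make_machinefile is a hypothetical helper: it prints the requested number
# of lines, cycling through the host names given after the first argument.
make_machinefile() {
    local slots="$1" i
    shift
    # At least one host is required, otherwise there is nothing to cycle through.
    [ "$#" -gt 0 ] || return 1
    local hosts=("$@")
    for ((i = 0; i < slots; i++)); do
        printf '%s\n' "${hosts[i % ${#hosts[@]}]}"
    done
}

# Eight entries alternating between b509 and b510.
make_machinefile 8 b509 b510
```

Redirect the output to mfile and pass it to mpirun with -machinefile mfile.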
      

Note: If you have multiple #@ environment statements, only the last will have effect. If you need to specify multiple environment variables, separate them by semicolons within a single #@ environment statement. For example:

#@ environment  = MP_SHARED_MEMORY=yes;MP_INFOLEVEL=3;MP_LABELIO=yes

Also, do not use the #@ executable statement if you are running parallel jobs. Parallel jobs use the job submission file as the executable.
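
Because only the last #@ environment statement takes effect, it can help to build that single line from a list of settings. A minimal sketch follows; environment_line is a hypothetical helper name.

```shell
#!/bin/bash
# environment_line is a hypothetical helper: it joins its arguments with
# semicolons into one #@ environment statement.
environment_line() {
    # "$*" joins the arguments with the first character of IFS.
    local IFS=';'
    printf '#@ environment = %s\n' "$*"
}

environment_line MP_SHARED_MEMORY=yes MP_INFOLEVEL=3 MP_LABELIO=yes
# prints: #@ environment = MP_SHARED_MEMORY=yes;MP_INFOLEVEL=3;MP_LABELIO=yes
```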

Most of this information was taken from http://rc.uits.iu.edu/kb/index.php?kbID=autn where more information can be found.