The Black cluster is Purdue's portion of the Indiana Economic Development Corporation (IEDC) machine at Indiana University, the IU portion of which is known as "Big Red". Black consists of 256 IBM JS21 Blades, each a Dual-Processor 2.5 GHz Dual-Core PowerPC 970MP with 8 GB of RAM and PCI-X Myrinet 2000 interconnects. The large amount of shared memory in this system provides very fast communication between processors, making this system ideal for large parallel jobs.
| Number of Nodes | Processor | Cores per Node | Memory per Node | Interconnect | TeraFlops |
|---|---|---|---|---|---|
| 256 | Dual-Processor 2.5 GHz Dual-Core PowerPC 970MP | 4 | 8 GB | PCI-X Myrinet 2000 | 5.12 |
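The aggregate figures implied by the table can be worked out with shell arithmetic. The following is a minimal sketch that takes the table's values at face value; the variable names are illustrative and not part of any RCAC tooling.

```shell
# Values copied from the hardware table above.
nodes=256
cores_per_node=4
mem_per_node_gb=8
peak_gflops=5120          # 5.12 TeraFlops expressed in GigaFlops

# Derived totals for the whole cluster.
total_cores=$((nodes * cores_per_node))
total_mem_gb=$((nodes * mem_per_node_gb))
gflops_per_core=$((peak_gflops / total_cores))

echo "Total cores:        $total_cores"
echo "Total memory (GB):  $total_mem_gb"
echo "GigaFlops per core: $gflops_per_core"
```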
Aside from Myrinet, Black nodes are also connected by Gigabit Ethernet to a 266 TB GPFS filesystem, hosted on 16 IBM p505 Power5 systems.
All Black nodes run SuSE Linux Enterprise Server 9 and use LoadLeveler 3.4.0 and Moab for resource and job management. Operating system patches are applied monthly or as security needs dictate. All nodes have been configured to allow for unlimited stack usage, as well as unlimited core dump size (though disk space and server quotas may still be a limiting factor).
The system interconnect is the networking technology that is used to connect nodes of a cluster to each other. Note that this is often much faster and sometimes radically different from the networking available between a resource and other machines or the outside world. Interconnects have different characteristics that may affect parallel message-passing programs and their design. Each RCAC resource has different interconnect options available, and some have more than one available to all or only portions of the resource's nodes. For information on which interconnects are available, refer to the hardware specification for the resource above. Details about the specific interconnects available on Black follow.
Myrinet is a high-speed local area networking system designed by Myricom to be used as an interconnect between nodes in a cluster. Myrinet has better throughput, less interference, and lower latency than Ethernet due to a much smaller protocol overhead.
Physically, Myrinet consists of two fiber-optic cables (upstream and downstream) which are connected to each node. The nodes are not connected directly to each other, but through Myrinet routers or switches. There are some fault-tolerance features in Myrinet as well, including flow and error control.
Purdue faculty and staff may obtain accounts on Black and Gray. If you are a Purdue affiliate, please use the online Research Computing Account Request Form.
You must enter your name and date of birth in the “Comments” area, which is on the last screen of the request. In a few days, you should receive email indicating that your account is ready and including an Indiana University ID number. You must then create a password/passphrase for your Black account using the online form at https://itaccounts.iu.edu/. Reserve 15 minutes to complete this process.
To issue jobs on Black, log in to the front-end host black.rcac.purdue.edu via SSH. Note that you must use the password you created for Black when you obtained your account, not your Purdue career account password.
Black also supports GSI-SSH access using TeraGrid credentials.
Here is what an initial login to Black will look like. Note you will be asked to choose a shell and then to give your passphrase/password once again.
$ ssh myusername@black.rcac.purdue.edu
Warning: Permanently added the RSA host key for IP address '149.165.234.32' to the list of known hosts.
Password:
*********************************************************************
Welcome to Indiana University's Big Red Cluster
Send questions, comments, etc. to hps-admin@iu.edu
*********************************************************************
BigRed message of the day here...
*********************************************************************
Welcome to Big Red!
This program is run the very first time you log in
to Big Red to allow you to select your login shell.
If you are uncertain which shell to select, choose
bash (Bourne-again shell).
1) bash
2) tcsh
3) ksh
4) zsh
5) quit
Select 1-5: 1
Changing login shell for myusername.
Password:
Shell changed.
Your shell has been changed to the Bourne-again shell
This will take effect on all nodes within 60 minutes
generating ssh file /N/u/myusername/BigRed/.ssh/id_rsa ...
Generating public/private rsa key pair.
Created directory '/N/u/myusername/BigRed/.ssh'.
Your identification has been saved in /N/u/myusername/BigRed/.ssh/id_rsa.
Your public key has been saved in /N/u/myusername/BigRed/.ssh/id_rsa.pub.
The key fingerprint is:
....
adding id to ssh file /N/u/myusername/BigRed/.ssh/authorized_keys
myusername@BigRed:/N/hd01/myusername/BigRed>
All access to the RCAC systems must be through secure (encrypted) connections. Standard telnet and FTP are not supported. SSH, SCP, and SFTP may be used instead.
Secure Shell or SSH is a way of establishing a secure channel between a local and a remote computer. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. It is usually used to log in to a remote machine and execute commands similar to telnet, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. The associated SFTP and SCP protocols may be used to transfer files. There are many SSH clients available, depending on the operating system you use.
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
SSH can be used in conjunction with many different means of authentication. One popular authentication method is called Public Key Authentication (PKA). PKA is a method of establishing your identity to a remote computer using related sets of encryption data called keys. PKA is a more secure alternative to traditional password-based authentication with which you are probably familiar.
To employ PKA via SSH, you manually generate a keypair (also called SSH keys) in the location from where you wish to initiate a connection to a remote machine. This keypair consists of two text files, one which is called a private key and one which is called a public key. You keep the private key file confidential on your local machine or local home directory (hence the name "private" key). You then login to a remote machine (if possible) and append the corresponding public key text to the end of a specific file, or have a system administrator do so on your behalf. In future login attempts, the public and private keys are compared to verify your identity, which then grants you access to the remote machine.
As a user, you can create, maintain, and employ as many keypairs as you wish. If you connect to a computational resource from your work laptop, your work desktop, and your home desktop, you can create and employ keypairs on each. You can also create multiple keypairs on a single local machine to serve different purposes, such as establishing access to different remote machines, or establishing different types of access to a single remote machine. In short, PKA via SSH offers a secure but flexible means of identifying yourself to all kinds of computational resources.
When you create a keypair, you are prompted to provide a passphrase for the private key. This passphrase differs from a password in a number of ways. First, a passphrase is, as the name implies, a phrase. It can include most types of characters, including spaces, and has no limits on length. Second, this passphrase is not transmitted to the remote machine for verification. It is used only to unlock your local private key and applies only to that one private key.
Perhaps you are wondering why you would need a private key passphrase at all when using PKA. If the private key is kept secure, why the need for a passphrase just to use it? Indeed, if the location of your private keys were always completely secure, a passphrase might not be needed. In reality, a number of situations could arise in which someone may improperly gain access to your private key files. In these situations, a passphrase offers another level of security for you, the user who created the keypair.
Think of the private key/passphrase combination as being analogous to your ATM card/PIN combination. The ATM card itself is the object that grants access to your important accounts, and as such, should be kept secure at all times—just as a private key should. But if you ever lose your wallet or your ATM card is stolen, you are glad that your PIN exists to offer you another level of protection. The same is true for a private key passphrase.
When you create a keypair, you should always provide a corresponding private key passphrase. For security purposes, avoid using phrases that would be guessed by automated programs (e.g. phrases that consist solely of words in English-language dictionaries). This passphrase can never be recovered if forgotten, so make note of it. There are only limited situations when the use of a non-passphrase-protected private key is warranted—conducting automated file backups is one such situation. If you need to use a non-passphrase-protected private key to conduct automated backups to Fortress, see the No-Passphrase SSH Keys section.
SSH supports tunneling of X11 (X-Windows). If you have an X11 server running on your local machine, you may use X11 applications on remote systems and have their graphical displays appear on your local machine. These X11 connections are tunneled and encrypted automatically by your SSH client. You will need to have a local X11 server running, but free and commercial X11 servers are available for various operating systems.
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Packages from the X11 group: X-startup-scripts, XFree86-lib-compat, xorg-*, xterm, xwinwm, lib-glitz-glx1. Under the Graphics group, also select opengl if you want OpenGL support. Once the Cygwin X server is installed, start an xterm, type XWin -multiwindow in it, and press Enter. You can now run your SSH client.
Mac OS X:
Once you are running an X11 server, you will need to enable X11 forwarding/tunneling in your SSH client:
Note that SSH will set the remote environment variable $DISPLAY to "localhost:XX.YY" when this is working correctly. If you had previously set your $DISPLAY environment variable to your local IP or hostname, you must remove any set/export/setenv of this variable from your login scripts. The environment variable $DISPLAY must be left as SSH sets it, which is to a random local port address. Setting $DISPLAY to an IP or hostname will not work.
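One quick way to check whether a login script has overridden the value SSH assigned is to test the form of $DISPLAY after logging in. The following is a minimal sketch; the function name display_looks_forwarded is hypothetical, and the pattern assumes the "localhost:XX.YY" form described above.

```shell
# Return 0 if the given DISPLAY value looks like one set by SSH X11
# forwarding (localhost:<number>...), and 1 otherwise.
display_looks_forwarded() {
    case "$1" in
        localhost:[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

if display_looks_forwarded "${DISPLAY:-}"; then
    echo "DISPLAY=${DISPLAY} appears to have been set by SSH"
else
    echo "DISPLAY='${DISPLAY:-}' does not look like an SSH-forwarded display"
fi
```

If the second message appears in a session started with X11 forwarding enabled, a login script that sets $DISPLAY is a likely cause.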
When you set up your account on Black, you created a passphrase. When you change that, you should do so on https://passphrase.iu.edu/ since this will change it on most UITS systems, including webmail.
Note that it may take up to 20 minutes before the change is reflected on the systems. You should log out of all UITS systems before making the change to avoid any potential problems. The passphrase for Black is totally independent of the one you may have on other Purdue systems, and changing one will not affect the other.
There is currently no requirement regarding how often you must change your password on Black, but for security reasons it is a good idea to change it at least once every 6 months, preferably every 3 months.
All passwords should:
Never share your password with another user or make your password known to anyone else. Systems staff will NEVER ask for your password, by email or otherwise.
Your Indiana University account for Black does include email services. You may access this through IU's Web Mail Interface or using an IMAP client connecting to imap.iu.edu. The easiest solution for most Purdue affiliates will be to simply set your IU account to forward all email to your Purdue email address (or any other).
To set up forwarding of all IU (Black) email, you have two options:
or:
echo myusername@purdue.edu > ~/.forward
When your account is activated, your default shell will probably be set to tcsh—an enhanced version of the Berkeley UNIX C shell (csh). The tcsh shell is completely compatible with the standard csh, and all csh commands and scripts work unedited with tcsh. For more details on tcsh, enter "man tcsh" while logged in.
The other popular shell is GNU Bourne-Again SHell (bash), which is completely compatible with the Bourne shell (sh). For more details on bash, enter "man bash" while logged in.
To change your shell temporarily or to try out another shell, just type the shell name as a command ("bash", "tcsh", "ksh"). This will run the new shell as a subshell. To return to your original shell, simply type exit.
To permanently change your login shell, use the command chsh:
$ chsh -s bash
(or)
$ chsh -s tcsh
To see a list of all available shells:
$ chsh -l
The next time you log on, you will start in the new shell. However, you may switch back at any time.
File storage options on Black include home directories, scratch file systems, and /tmp. Each of these has different performance characteristics and intended uses. Home directories are backed up nightly, but scratch and /tmp are not and may occasionally be purged without warning. Below is more detail about each of these storage options.
Your home directory is the default directory you are placed in when you log in.
You should use this space for storing files you want to keep long term such as source code, scripts, input data sets, etc. It should also be used for files you want to keep and which you use often. Your home directory will physically reside on an NFS server connected via Gigabit Ethernet. You can find the path to your home directory by logging in, and typing "pwd":
$ pwd
/home/somepath/myusername
Note that your home directory on Black is not the same as your home directory on other RCAC systems. You may transfer data between Black and other RCAC systems using one of the programs mentioned in the File Transfer section.
Scratch directories are intended for short term file storage only.
Backups are not performed on the scratch directories, and files there may be removed (purged) without warning. In the event of a disk crash or file purge, files in scratch directories cannot be recovered. Please be sure to copy any important files to more permanent storage.
The /tmp directory is intended for temporary files that are used during the execution of a process or job or while you examine files created by your jobs. Used properly, /tmp may provide faster local storage to an active process than any other storage option. However, do not use it for longer-term storage or critical results.
Files stored in /tmp are not backed up and are removed automatically once they are more than 24 hours old, whenever space is low, or whenever the system is rebooted. In the event of a loss, files in /tmp cannot be recovered, so use it only for files that can be recreated relatively easily.
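The usage pattern described above (work in /tmp, then move results somewhere permanent) can be sketched as a small bash function. This is an illustration under stated assumptions: the function name run_in_tmp and the myjob prefix are hypothetical, and the demonstration writes to a throwaway directory where a real job would use a directory under the home directory.

```shell
# Run a command inside a private directory under /tmp, copy whatever it
# produced to a permanent location, then remove the temporary directory.
run_in_tmp() {
    local results_dir="$1"
    shift
    local workdir status
    workdir="$(mktemp -d /tmp/myjob.XXXXXX)" || return 1
    ( cd "$workdir" && "$@" )
    status=$?
    mkdir -p "$results_dir" && cp -R "$workdir"/. "$results_dir"/
    rm -rf "$workdir"
    return "$status"
}

# Demonstration with a stand-in "job" that writes one small file.
# In real use, results_dir would be a directory under $HOME.
results_dir="$(mktemp -d)"
run_in_tmp "$results_dir" sh -c 'echo "result data" > output.dat'
ls "$results_dir"
```

Copying the results out as part of the same function means nothing of value is left behind for the automatic /tmp cleanup to remove.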
Long-term or permanent storage is available to RCAC users on the DXUL/UniTree archival storage system, commonly referred to as "Fortress". DXUL (DiskXtender for Unix and Linux) and UniTree are software that manages a hierarchical storage system. Program files, data files, and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity. However, since two copies of every file are retained, the usable capacity is only 600 TB.
Recently used files smaller than 0.5 MB have their primary copy stored on low-cost disks, but the second copy is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the large latency to access a larger file (usually involving a copy from a tape cartridge) makes it unsuitable for use as active storage.
In addition to poor performance, these two uses can cause severe problems with the system itself:
Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first.
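Combining small files into one archive before storing them can be sketched as follows. The function name bundle_dir and the sample file names are hypothetical; the demonstration runs on throwaway data and verifies the archive by listing it.

```shell
# Bundle a directory of small files into one compressed archive and
# verify the archive by listing it before it is transferred anywhere.
bundle_dir() {
    src_dir="$1"       # directory to archive
    archive="$2"       # path of the .tar.gz file to create
    tar czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" &&
        tar tzf "$archive" > /dev/null
}

# Demonstration on throwaway data (hypothetical file names).
demo_root="$(mktemp -d)"
mkdir "$demo_root/smallfiles"
for i in 1 2 3; do
    echo "sample $i" > "$demo_root/smallfiles/file$i.txt"
done

bundle_dir "$demo_root/smallfiles" "$demo_root/smallfiles.tar.gz"
tar tzf "$demo_root/smallfiles.tar.gz"
```

The -C option makes the archive contain paths relative to the parent directory, so extracting it recreates smallfiles/ wherever the archive is unpacked.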
Note that Fortress cannot be accessed directly from Black; files must be transferred to it over the network. Due to the distance involved, transferring extremely large amounts of data to and from Fortress may not be practical. RCAC has more information about Fortress.
There are a variety of ways to manually transfer files to your Fortress home directory for long-term storage.
You can use an SCP client to interactively transfer individual files and directories to Fortress. More information on SCP can be found in the File Transfer - SCP section of this guide.
You can use an SFTP client to interactively transfer individual files and directories to Fortress. More information on SFTP can be found in the File Transfer - SFTP section of this guide.
In the absence of NFS access to Fortress, you must log in to fortress.rcac.purdue.edu to transfer files to long-term storage. There are limited situations where the use of a password or a passphrase-protected authentication keypair becomes impractical, and running scripted file backups to Fortress happens to be one of them. When you attempt to establish a connection to Fortress, you will be prompted to input a password or a local private key passphrase. Any time a script or automated process needs to establish the connection, it is unable to respond to such a request. To enable truly automated transfer of files to Fortress, you need to employ public key authentication via SSH with a non-passphrase-protected private key. For a conceptual overview of public key authentication, see the SSH Keys section of this guide.
If your home directory is compromised and an attacker obtains your non-passphrase-protected private key, the attacker will be able to masquerade as you on machines that contain the corresponding public key. Fortunately, certain usage restrictions can be customized for each keypair you employ. For example, you could create a non-passphrase-protected keypair and later specify that this public key shall only be used to run a file-backup script and, additionally, is only valid when connecting from a specific machine. Then, if the non-protected private key were compromised, the attacker could do nothing more than run your file-backup script repeatedly.
It is very important to place a passphrase on all of your generated keypairs. Only use non-protected keypairs when absolutely necessary.
Here is how to set up a non-passphrase-protected keypair for use with automated backup scripts to Fortress from Black.
$ ssh-keygen -t rsa -N "" -f ~/.ssh/mykeypairname
The ssh-keygen command should have created the following files:
$ ls ~/.ssh/mykey*
mykeypairname  mykeypairname.pub
The first file is the private key. The second file is the public key counterpart.
Edit ~/.ssh/mykeypairname.pub and add the following options at the very beginning of the key line, followed by a single space:
from="*.rcac.purdue.edu",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty
This tells SSH to only allow connections from RCAC resources, to disable a number of forwarding functions, and to not allow interactive shell commands, respectively.
$ scp ~/.ssh/mykeypairname.pub myusername@fortress.rcac.purdue.edu:~/
$ ssh myusername@fortress.rcac.purdue.edu
$ cd ~/.ssh/
$ chmod 600 ~/.ssh/authorized_keys
If it does not exist, create it:
$ touch ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
$ cat ~/mykeypairname.pub >> ~/.ssh/authorized_keys
$ cat ~/.ssh/authorized_keys
from="*.rcac.purdue.edu",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-rsa AABBB3NzaC1yc2EABBABIwAAAIEA3SXgmvos4jFLVFLRrh6YrN3s8FuBOUTCJ0NIsc+FtFrSGD2bVV6yMCgpdgz9RZS7U5uTJOW2VBWsJSb6cjjnA2WJzDcS0bEU3lw+TJszv2sEfl/CwF6dyj2U2k5VrXIpdosZVKyjoqzQXhFicIRv1/ykdO8xp+qcgc09NbcyGhs= myusername@resource.rcac.purdue.edu
$ rm ~/mykeypairname.pub
$ exit
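Placing the restriction options in front of the public key can also be scripted rather than done in an editor. The following is a minimal sketch; restrict_pubkey is a hypothetical helper name, and the key used in the demonstration is a placeholder rather than a real key.

```shell
# Print the contents of a public key file with a set of authorized_keys
# options placed in front of the key, separated by a single space.
restrict_pubkey() {
    options="$1"       # comma-separated options, no spaces
    pubkey_file="$2"   # path to the .pub file
    printf '%s %s\n' "$options" "$(cat "$pubkey_file")"
}

# Options taken from the procedure above.
opts='from="*.rcac.purdue.edu",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty'

# Demonstration with a placeholder key (not a real key).
demo_pub="$(mktemp)"
echo 'ssh-rsa AAAAexamplekeydata myusername@black' > "$demo_pub"
restricted_line="$(restrict_pubkey "$opts" "$demo_pub")"
echo "$restricted_line"
```

The printed line is what would be appended to authorized_keys on Fortress. OpenSSH also accepts a command="..." option in the same position, which limits the key to running a single command; whether that suits a particular backup workflow should be verified before relying on it.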
If you have followed the instructions in the No-Passphrase SSH Keys section to employ an unprotected SSH keypair between Black and Fortress, you can automate the backup process using backup scripts. Because of the restrictions you placed upon the public key, you cannot use this keypair to log on to an interactive SSH session on Fortress, but you can use it to send files from your Black home directory to Fortress via SCP, or to run local scripts that employ SCP.
Since you can have multiple private keys on Black (and, similarly, multiple public keys in any given "authorized_keys" file on Fortress), you always need to specify which keypair you intend to employ for a log-in attempt to Fortress. The most consistent way to do this is with SSH's "-o" flag. This passes options to configure SSH and can be used with all programs that use SSH for providing a secure connection (e.g. SCP, SFTP, and RSYNC).
To test automated SCP authentication from Black to Fortress, use the following command:
$ scp -o IdentityFile=~/.ssh/mykeypairname ./mylocalfile myusername@fortress.rcac.purdue.edu:~/myremotefile
If this works (i.e. you are not prompted for a passphrase or login password), you can move on to implementing a script using SCP commands like the one above.
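As an alternative to repeating -o IdentityFile on every command, the same setting can be kept in an SSH client configuration file. The following is a sketch only: the alias "fortress" is hypothetical, and the file is written to a temporary location here so that an existing configuration is not overwritten.

```shell
# Write a client configuration file that names the key to use for Fortress.
# "fortress" is a hypothetical alias; the other values come from this guide.
config_file="$(mktemp)"
cat > "$config_file" <<'EOF'
Host fortress
    HostName fortress.rcac.purdue.edu
    User myusername
    IdentityFile ~/.ssh/mykeypairname
EOF
cat "$config_file"
```

With this file, a transfer could be written as scp -F "$config_file" ./mylocalfile fortress:~/myremotefile. Placing the same entries in ~/.ssh/config would make the -F option unnecessary.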
While only you can ultimately decide the best approach for your automated backup process, the example scripts below show, in general, how to employ backup scripts on Black using SCP commands and public key authentication via SSH. The following bash script, named "fortress_backup_script_scp", uses SCP to recursively copy two directories on a user's Black home directory to the user's Fortress home directory:
#!/usr/local/bin/bash
# A script to use SCP to copy
# whole directories to Fortress
# Define some parameters
user=myusername
remotehost=fortress.rcac.purdue.edu
idfile=~/.ssh/mykeypairname
# Manually populate an array of directories on the
# local machine we wish to back up on Fortress
localdir[0]=~/mydir2backup
localdir[1]=~/mydir2backup_also
# Get the number of directories to be backed up
numdirs=${#localdir[*]}
count=1
# Loop over every entry in the "localdir" array to
# copy each directory recursively to a folder of
# the same name in our home directory on Fortress.
printf "\n>> Starting Secure Copy backup to Fortress\n"
for dir in "${localdir[@]}"
do
printf ">> Copying directory $dir to Fortress ($count of $numdirs)\n"
scp -r -o IdentityFile=$idfile $dir $user@$remotehost:~/
let count++
done
printf ">> Done...\n\n"
The output for this script is as follows:
$ ./fortress_backup_script_scp
>> Starting Secure Copy backup to Fortress
>> Copying directory /home/ba01/u100/myusername/mydir2backup to Fortress (1 of 2)
bigfile2.tar.gz 100% 121MB 30.3MB/s 00:04
bigfile1.tar.gz 100% 121MB 40.5MB/s 00:03
>> Copying directory /home/ba01/u100/myusername/mydir2backup_also to Fortress (2 of 2)
bigfile4.tar.gz 100% 121MB 40.5MB/s 00:03
bigfile3.tar.gz 100% 121MB 40.5MB/s 00:03
>> Done...
By using these techniques, you can automate your file backups to Fortress safely and efficiently.
If you have followed the instructions in the No-Passphrase SSH Keys section to employ an unprotected SSH keypair between Black and Fortress, you can automate the backup process using backup scripts. Because of the restrictions you placed upon the public key, you cannot use this keypair to log on to an interactive SSH session on Fortress, but you can use it to send files from your Black home directory to Fortress via SFTP or to run local scripts that employ SFTP.
Since you can have multiple private keys on Black (and similarly, multiple public keys in any given "authorized_keys" file on Fortress), you always need to specify which keypair you intend to employ for a log-in attempt to Fortress. The most consistent way to do this is with SSH's "-o" flag. This passes options to configure SSH and can be used with all programs that use SSH for providing a secure connection (e.g. SCP, SFTP, and RSYNC).
To test automated SFTP authentication from Black to Fortress, use the following command:
$ sftp -o IdentityFile=~/.ssh/mykeypairname myusername@fortress.rcac.purdue.edu
sftp> bye
$
If this works (i.e. you are not prompted for a passphrase or login password), you can move on to implementing a script using SFTP commands like the one above.
While only you can ultimately decide the best approach for your automated backup process, the example scripts below show, in general, how to employ backup scripts on Black using SFTP commands and public key authentication via SSH. The following bash script, named "fortress_backup_script_sftp", uses SFTP commands to navigate through Fortress directories, and pushes files from the user's Black home directory when needed.
#!/usr/local/bin/bash
# A script to use SFTP to push files to
# Fortress for backup.
# Set up some parameters
user=myusername
remotehost=fortress.rcac.purdue.edu
idfile=~/.ssh/mykeypairname
printf "\n>> Starting Secure FTP backup session to Fortress\n"
# Invoke SFTP mode, specifying the correct private key,
# and forcing batch file input from a "here-document"
# (i.e. the rest of this script).
sftp -o IdentityFile=$idfile -b - $user@$remotehost << EOF
cd ./mydir2backup
lcd ./mydir2backup
put -P ./bigfile1.tar.gz
put -P ./bigfile2.tar.gz
cd ../mydir2backup_also
lcd ../mydir2backup_also
put -P ./bigfile3.tar.gz
put -P ./bigfile4.tar.gz
bye
EOF
# Now we are back to the bash shell...
printf ">> Done...\n\n"
The output for this script is as follows:
$ ./fortress_backup_script_sftp
>> Starting Secure FTP backup session to Fortress
sftp>
sftp> cd ./mydir2backup
sftp> lcd ./mydir2backup
sftp>
sftp> put -P ./bigfile1.tar.gz
Uploading ./bigfile1.tar.gz to /archive/fortress/home/myusername/mydir2backup/bigfile1.tar.gz
sftp> put -P ./bigfile2.tar.gz
Uploading ./bigfile2.tar.gz to /archive/fortress/home/myusername/mydir2backup/bigfile2.tar.gz
sftp>
sftp> cd ../mydir2backup_also
sftp> lcd ../mydir2backup_also
sftp>
sftp> put -P ./bigfile3.tar.gz
Uploading ./bigfile3.tar.gz to /archive/fortress/home/myusername/mydir2backup_also/bigfile3.tar.gz
sftp> put -P ./bigfile4.tar.gz
Uploading ./bigfile4.tar.gz to /archive/fortress/home/myusername/mydir2backup_also/bigfile4.tar.gz
sftp>
sftp> bye
>> Done...
$
By using these techniques, you can automate your file backups to Fortress safely and efficiently.
On Black you may use SoftEnv, an environment management system, to customize your environment (specify the software packages you plan to use) using symbolic keywords. For more information about using SoftEnv on Black, refer to Indiana University's "Big Red" SoftEnv documentation.
Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:
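Environment variable names are conventionally all uppercase, and a variable is referenced by prefixing its name with a dollar sign ($). These references may be used on the command line or in any scripts in place of and in combination with hard-coded values. When a variable name is followed directly by other characters, enclose the name in braces, as in ${HOSTNAME}_data, so that the shell does not read those characters as part of the name:
$ ls $HOME
...
$ ls $HOME/myproject
...
$ ls $HOME/myproject/${HOSTNAME}_data
...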
You may find the value of any environment variable by using the "echo" command:
$ echo $HOME
/home/somepath/myusername
$ echo $SHELL
/usr/local/bin/tcsh
You may list the values of all environment variables using the "env" command:
$ env
USER=myusername
HOME=/home/ba01/u101/myusername
SHELL=/usr/local/bin/tcsh
...
You may create or overwrite an environment variable using either "export" or "setenv", depending on your shell:
(for bash and sh)
$ export VARIABLE=value
(for tcsh and csh)
$ setenv VARIABLE value
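For bash and sh, the effect of "export" can be seen by comparing what a child process sees before and after the variable is exported. The following is a small sketch; MYPROJECT_DIR is a hypothetical variable name and the path is a placeholder.

```shell
# Make sure the demonstration starts from a clean state.
unset MYPROJECT_DIR

# A plain assignment creates a shell variable that child processes cannot see.
MYPROJECT_DIR="/home/somepath/myusername/myproject"   # hypothetical name and path
before="$(sh -c 'echo "${MYPROJECT_DIR:-unset}"')"

# After export, the variable is part of the environment and is inherited.
export MYPROJECT_DIR
after="$(sh -c 'echo "${MYPROJECT_DIR:-unset}"')"

echo "before export: $before"
echo "after export:  $after"
```

This matters for job scripts and other programs started from the shell, since they only receive variables that have been exported.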
Your disk usage is limited on RCAC systems. However, each filesystem (scratch, home directory, etc.) may have a different limit. If you exceed the soft limit or quota, you will see warnings whenever writing to the disk that you are over quota, but the write will still succeed. If you exceed the hard limit or limit, your write will fail until you either remove other files or your quota is increased. Generally, RCAC systems do not impose a soft limit—only a hard limit.
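The soft and hard limit behavior described above amounts to two comparisons. The sketch below classifies a usage figure against a quota and a limit; the function name quota_state is hypothetical, the numbers are sample values of the kind reported by the "quota" command, and treating a limit of 0 as "no limit set" is an assumption based on the usual quota convention.

```shell
# Classify usage against a soft limit (quota) and a hard limit.
# A limit of 0 is treated as "no limit set" (assumption).
quota_state() {
    used="$1"; soft="$2"; hard="$3"
    if [ "$hard" -gt 0 ] && [ "$used" -ge "$hard" ]; then
        echo "hard limit reached: writes fail"
    elif [ "$soft" -gt 0 ] && [ "$used" -gt "$soft" ]; then
        echo "over soft limit: warnings, writes still succeed"
    else
        echo "within limits"
    fi
}

quota_state 36347 50000 55000   # sample block figures
quota_state 52000 50000 55000   # between the quota and the limit
quota_state 55000 50000 55000   # at the limit
quota_state 1178 0 0            # no limits set
```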
You may find out what your current quota is by using the "quota" command:
$ quota
Disk quotas for user myusername (uid 12345):
Filesystem blocks quota limit grace files quota limit grace
/N/fs8 36347 50000 55000 1178 0 0
The columns are as follows:
You may also see the disk usage of any given directory by using "du":
$ du -hs
1.1G .
$ du -hs $HOME
35M /N/fs8/myusername
This can be very helpful in figuring out where your largest files or directories are, so that you may clean out unneeded large files and avoid hitting your quota.
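To rank the contents of a directory by size, the output of "du" can be sorted numerically. The following is a minimal sketch; the function name largest_entries is hypothetical, and the demonstration uses throwaway files rather than a real home directory.

```shell
# List the N largest entries directly under a directory, sizes in kilobytes.
largest_entries() {
    dir="$1"
    count="${2:-5}"
    du -sk "$dir"/* | sort -rn | head -n "$count"
}

# Demonstration on throwaway data.
demo_dir="$(mktemp -d)"
head -c 2097152 /dev/urandom > "$demo_dir/big.dat"
echo "tiny" > "$demo_dir/small.txt"
largest_entries "$demo_dir" 1
```

Running largest_entries "$HOME" 10 would show the ten largest files or directories at the top level of a home directory. Note that the shell pattern used here skips names beginning with a dot.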
If you find you need additional disk space on Black, please first consider archiving and compressing old files. If this is not able to resolve the issue, you may contact the Indiana University High Performance Systems Group to request additional space.
There are several options for archiving and compressing groups of files or directories on RCAC systems. All of the following tools are provided:
(compress file somefile.c)
$ zip somefile.zip somefile.c
(extract contents of somefile.zip)
$ unzip somefile.zip
(compress all files in a directory into one archive file)
$ zip -r somefile.zip somedirectory/
(compress all ".c" files in current directory into one archive file)
$ zip -r somefile.zip . -i \*.c

(archive file somefile.c)
$ tar cvf somefile.tar somefile.c
(archive and compress file somefile.c)
$ tar czvf somefile.tar.gz somefile.c
(list contents of archive somefile.tar)
$ tar tvf somefile.tar
(extract contents of somefile.tar)
$ tar xvf somefile.tar
(extract contents of gzipped archive somefile.tar.gz)
$ tar xzvf somefile.tar.gz
(archive and compress all files in a directory into one archive file)
$ tar czvf somefile.tar.gz somedirectory/
(archive and compress all ".c" files in current directory into one archive file)
$ tar czvf somefile.tar.gz *.c

(compress file somefile - also removes uncompressed file)
$ gzip somefile
(uncompress file somefile.gz - also removes compressed file)
$ gunzip somefile.gz

(compress file somefile - also removes uncompressed file)
$ bzip2 somefile
(uncompress file somefile.bz2 - also removes compressed file)
$ bunzip2 somefile.bz2
Windows users can work with these same formats using some of the following software:
There are a variety of ways to transfer data to and from RCAC systems. Which you should use depends on several factors, including ease of use for you personally, connection speed and bandwidth, and the size and number of files to be transferred.
FTP (File Transfer Protocol) is a simple data transfer mechanism. FTP was not designed to provide secure communications, so FTP is no longer supported on any RCAC systems. However, most modern FTP clients also support SFTP or SCP, which are similar, secure protocols for file transfer. Use one of the other methods described here instead of FTP.
SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (log-in) access. SCP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files, and with the -r option it will also recursively copy the contents of a directory.
Command-line usage:
(to a remote system from local)
$ scp sourcefilename myusername@hostname:somedirectory/destinationfilename
(from a remote system to local)
$ scp myusername@hostname:somedirectory/sourcefilename destinationfilename
(recursive directory copy to a remote system from local)
$ scp -r sourcedirectory/ myusername@hostname:somedirectory/
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
SFTP (Secure File Transfer Protocol) is a reliable way of transferring files between two machines. You may use SFTP to connect to most RCAC systems. SFTP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SFTP has more features than SCP and allows for other operations on remote files, remote directory listing, and resuming interrupted transfers. Command-line SFTP cannot recursively copy directory contents; to do so, try using SCP or a graphical SFTP client.
Command-line usage:
$ sftp -B buffersize myusername@hostname
(to a remote system from local)
sftp> put sourcefile somedir/destinationfile
sftp> put -P sourcefile somedir/
(from a remote system to local)
sftp> get sourcefile somedir/destinationfile
sftp> get -P sourcefile somedir/
sftp> exit
LFTP is a command-line file-transfer program for Linux and Unix systems. It supports SFTP, HTTP, and HTTPS file-transfers. LFTP has additional features not provided by SFTP such as bandwidth throttling, transfer queues, and parallel transfers. It may be used interactively or scripted.
LFTP with parallel transfers can be much faster than SCP or SFTP, so its use is encouraged when possible.
LFTP is provided only on some RCAC systems. However, it is simply a client, so it is not needed on the remote machine involved in a transfer (the remote system need only support SFTP).
Interactive usage:
$ lftp myusername@hostname
(transfer all ".dat" files from remote system to local)
lftp :~> mget *.dat
(transfer "filename.dat" file from local system to remote)
lftp :~> put filename.dat
(transfer a directory and all contents from remote
system to local, using 5 connections in parallel)
lftp :~> mirror --parallel=5 remotedirectory localdirectory/
(transfer a directory and all contents from local
system to remote, using 8 connections in parallel)
lftp :~> mirror -R --parallel=8 localdirectory remotedirectory/
Batch usage:
(specify all actions on command line)
$ lftp myusername@hostname -e "mget *.dat"
(specify all actions in the script file "mytransfer.lftp")
$ lftp myusername@hostname -f mytransfer.lftp
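The script file named in the batch example could look like the sketch below. The file name mytransfer.lftp comes from the example above, and the directory names are the same placeholders used in the interactive examples. The snippet only writes and prints the file; nothing is transferred until the file is passed to lftp with -f.

```shell
# Write a small LFTP script file. The directory names are placeholders;
# replace them with your own.
cat > mytransfer.lftp <<'EOF'
# Fetch all ".dat" files from the remote working directory.
mget *.dat
# Mirror a remote directory to a local one using 5 parallel connections.
mirror --parallel=5 remotedirectory localdirectory/
# Close the connection when done.
exit
EOF

# Show the script that would be handed to:
#   lftp myusername@hostname -f mytransfer.lftp
cat mytransfer.lftp
```

Keeping the transfer steps in a file makes it easy to rerun the same transfer or to adjust the number of parallel connections later.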
GridFTP is a fast method of transferring large files that uses Globus authentication credentials (x509 certificates). GridFTP is available on some RCAC resources, but only to users who are members of a Grid project, such as TeraGrid, NorthWest Indiana Computational Grid (NWICG), or Open Science Grid (OSG). Note that not all grids may access all RCAC resources.
For more information about how to use GridFTP, consult documentation for your participating grid.
Software packages on Black are loaded using SoftEnv. Indiana University hosts a list of available applications and how to load them.
The "module" command used on most RCAC systems is not available on Black. SoftEnv is provided instead.
On Black, software packages are loaded using SoftEnv. Indiana University has more detailed instructions on how to use SoftEnv.
Compilers are available on Black for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution. More detailed documentation on each compiler set available on Black follows.
Compilation of serial programs for Black may be done on Gray, should you only have Condor access to Black.
Here is some more documentation from Indiana University about compilation on Black:
To use the IBM compiler set on Black, you do not need to load any modules. The compiler programs will generally already be in your path. However, to compile MPI programs, you will need to load MPI support via softenv. Here are some examples:
| Language | Serial Program | OpenMP Program |
|---|---|---|
| Fortran77 | $ xlf_r myprogram.f -o myprogram | $ xlf_r -qsmp=omp myprogram.f -o myprogram |
| Fortran90 | $ xlf90_r myprogram.f -o myprogram | $ xlf90_r -qsmp=omp myprogram.f -o myprogram |
| Fortran95 | $ xlf95_r myprogram.f -o myprogram | $ xlf95_r -qsmp=omp myprogram.f -o myprogram |
| C | $ xlc_r myprogram.c -o myprogram | $ xlc_r -qsmp=omp myprogram.c -o myprogram |
| C++ | $ xlC_r myprogram.cpp -o myprogram | $ xlC_r -qsmp=omp myprogram.cpp -o myprogram |
| Language | MPI Program (32-bit) | MPI Program (64-bit) |
|---|---|---|
| Fortran77 | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-32<br>$ mpif77 myprogram.f -o myprogram | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-64<br>$ mpif77 -q64 myprogram.f -o myprogram |
| Fortran90 | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-32<br>$ mpif90 myprogram.f -o myprogram | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-64<br>$ mpif90 -q64 myprogram.f -o myprogram |
| Fortran95 | (not available) | (not available) |
| C | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-32<br>$ mpicc myprogram.c -o myprogram | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-64<br>$ mpicc -q64 myprogram.c -o myprogram |
| C++ | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-32<br>$ mpiCC myprogram.cpp -o myprogram | $ soft add +teragrid-dev<br>$ soft add +mpich-mx-ibm-64<br>$ mpiCC -q64 myprogram.cpp -o myprogram |
More information on compiler options can be found in the official man pages, which can be accessed using the "man" command, or online here:
Here is some more documentation from other sources on the IBM compilers:
To use the GNU compiler set on Black, you do not need to load any modules. The compiler programs will already be in your path. Here are some examples:
| Language | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran77 | $ gfortran myprogram.f -o myprogram | (not available) | (not available) |
| Fortran90 | $ gfortran myprogram.f90 -o myprogram | (not available) | (not available) |
| Fortran95 | $ gfortran myprogram.f95 -o myprogram | (not available) | (not available) |
| C | $ gcc myprogram.c -o myprogram | (not available) | (not available) |
| C++ | $ g++ myprogram.cpp -o myprogram | (not available) | (not available) |
More information on compiler options can be found in the official man pages, which can be accessed using the "man" command, or online here:
Compilers for C, C++, and versions of Fortran are available. To see a Fortran 77 program with OpenMP commands: omp_hello_f77.f. To see a C program with OpenMP commands: omp_hello.c. See the table in the section IBM Compiler Set for how to compile your program. Any compiler flags accepted by the xlf90/xlc compilers can be used with OpenMP.
Example, compiling an OpenMP C program:
user123@BigRed:~> xlc_r -qsmp=omp omp_hello.c
user123@BigRed:~>
Note that in general, the compilers will not output anything for a successful compilation.
Compilers for C, C++, and versions of Fortran are available. To see a Fortran 77 program with MPI commands: hello77.f. To see a C program with MPI commands: hello.c.
MPI libraries available through softenv
To find the MPI libraries available on Black/Big Red, run the softenv command and use grep to search for mpi.
Among the keys listed, you should see two packages named +mpich-mx-ibm-32 and +mpich-mx-ibm-64:
user123@BigRed:~> softenv | grep mpi
@GA-mpich-ibm-32 32 bit Global Array MPICH compiled
@GA-mpich-ibm-64 64 bit Global Array MPICH compiled
@globus--4.0.1-mpich-mx-xlc Globus 4.0.1 mpicc64 flavor with xlc compiler
@mpich-g2-1.2.6e-mpich-mx-xlc MPICH-G2 1.2.6e with Globus 4.0.1 mpi, IBM compiler
+R-2.5.0-ibm-64 IBM compiler 64 bit R
...
+mpi-hmmer Parallel HMMer, uses mpich-mx-ibm-32
P +mpich-g2-1.2.6e-mpich-mx-xlc-r2
MPICH-G2 1.2.6e with Globus 4.0.1 mpi
+mpich-mx-1.2.7..1-gcc-32 MPICH MX 1.2.7..1 gcc 3.3.3 32-bit
+mpich-mx-1.2.7..1-gcc-4.1.1-64
+mpich-mx-1.2.7..1-gcc-64 MPICH MX 1.2.7..1 gcc 3.3.3 STATIC 64-bit
+mpich-mx-ibm-32 MPICH MX xlc/xlf 32bit
+mpich-mx-ibm-64 MPICH MX xlc/xlf 64bit
...
user123@BigRed:~>
The key +mpich-mx-ibm-32, for example, indicates an MPICH library compiled with the IBM XL compilers in 32-bit mode that communicates using the MX protocol. Open MPI is also installed. The list above has been shortened with ... since it is very long.
To use MPICH-MX, add the MPICH-MX key to softenv. Add +mpich-mx-ibm-32 or +mpich-mx-ibm-64 to your ~/.soft file, depending on whether you need a 32-bit or a 64-bit MPICH library. Your ~/.soft file should look similar to this:
user123@BigRed:~> less .soft
#
# This is the .soft file.
# It is used to customize your environment by setting up environment
# variables such as PATH and MANPATH.
# To learn what can be in this file, use 'man softenv'.
#
#
#+mpich-mx-ibm-64
+mpich-mx-ibm-32
@bigred
@teragrid-basic
@teragrid-dev
## Some more keys you may have at Black/Big Red
user123@BigRed:~>
Remember to run resoft after making changes to the .soft file.
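The edit to ~/.soft can also be scripted. The following is a minimal sketch, not part of SoftEnv itself: the helper name add_soft_key is hypothetical, and it appends the key at the end of the file, whereas the example file above lists the MPICH key before the @ macros. If the position of the key matters for your environment, edit the file by hand instead. The demonstration works on a scratch file rather than the real ~/.soft.

```shell
# add_soft_key KEY [FILE]
# Append a SoftEnv key to FILE (default: ~/.soft) unless an identical,
# uncommented line is already present. Hypothetical helper for illustration.
add_soft_key() {
    key="$1"
    file="${2:-$HOME/.soft}"
    touch "$file"
    # -x matches whole lines and -F treats the key as a fixed string,
    # so a commented-out "#+key" line does not count as present.
    if ! grep -qxF -- "$key" "$file"; then
        printf '%s\n' "$key" >> "$file"
    fi
}

# Demonstration on a scratch file.
demo_file="$(mktemp)"
printf '%s\n' '#+mpich-mx-ibm-64' '@bigred' > "$demo_file"
add_soft_key +mpich-mx-ibm-32 "$demo_file"
add_soft_key +mpich-mx-ibm-32 "$demo_file"   # second call changes nothing
cat "$demo_file"
```

After changing the real ~/.soft this way, resoft still has to be run for the change to take effect.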
To see a table of how to compile, go to the section IBM Compiler Set.
Example, compile your MPI code
For example, to compile a 32-bit parallel C program (assuming you have +mpich-mx-ibm-32 in your ~/.soft file), you would use something like:
user123@BigRed:~> mpicc hello_mpi.c -o hello_mpi_c
user123@BigRed:~>
To compile a 64-bit parallel Fortran 77 program (assuming you have +mpich-mx-ibm-64 in your ~/.soft file), you would use something like:
user123@BigRed:~> mpif77 hello_mpi.f -o hello_mpi_f77
user123@BigRed:~>
To compile a 32-bit parallel C program using Open MPI 1.1.1 (assuming you have +openmpi-1.1.1-xlc-8.0-32 in your ~/.soft file):
user123@BigRed:~> mpicc hello_mpi.c -o hello_mpi_c
Compilers for C, C++, and versions of Fortran are available. To see a hybrid C++ program with OpenMP/MPI commands: hybrid.cpp.
You need to make sure the MPI libraries are added through softenv. To add them, follow the description in the section "MPI libraries available through softenv" above: find the MPICH-MX keys with softenv, add +mpich-mx-ibm-32 or +mpich-mx-ibm-64 to your ~/.soft file, and run resoft.
To see a table of how to compile, go to the section IBM Compiler Set and look at the MPI compilers. Hybrid code is compiled the same way.
Example, compile your hybrid code
For example, to compile a 32-bit hybrid C++ program (assuming you have +mpich-mx-ibm-32 in your ~/.soft file), you would use something like:
user123@BigRed:~> mpicxx hybrid.cpp -o hello_mpi_c
user123@BigRed:~>
Compiling and linking in IBM's Engineering and Scientific Subroutine Libraries (ESSL).
If you are using actual threads (like OpenMP or POSIX), the compile switch is -qesslsmp. Even serial code requires the _r suffix, as in xlf90_r -q64 -qessl... or xlc_r -qessl.... For xlC_r, you must add -lessl -qnocinc=/usr/include/essl as well to redefine the include files.
MPICH2 (and MPICH) is available for some compiler combinations on Black. Refer to the compilers section for an overview of how to link in MPICH2 support. Here is some more documentation from other sources on the MPICH2 and MPICH libraries:
Intel Math Kernel Library (MKL) contains ScaLAPACK, LAPACK, Sparse Solver, BLAS, Sparse BLAS, CBLAS, GMP, FFTs, DFTs, VSL, VML, and Interval Arithmetic routines. MKL can be found in the directory "/opt/intel/mkl/9.1" and it is divided into the following subdirectory structure:
Here are some example combinations of linking options:
(static linking of LAPACK and Kernels)
$ <fortran_compiler> myprogram.f -L${MKLPATH} -lmkl_lapack -lmkl_ia32 -lguide -lpthread
(static linking of Fortran-95 LAPACK Interface and Kernels)
$ <fortran_compiler> myprogram.f95 -L${MKLPATH} -lmkl_lapack95 -lmkl_lapack -lmkl_ia32 -lguide -lpthread
(static linking of BLAS, Sparse BLAS, GMP, VML/VSL, Interval Arithmetic, and FFT/DFT)
$ <c_compiler> myprogram.c -L${MKLPATH} -lmkl_ia32 -lguide -lpthread -lm
(dynamic linking of BLAS or FFTs)
$ <c_compiler> myprogram.c -L${MKLPATH} -lmkl -lguide -lpthread
It is recommended that you use dynamic linking of libguide. If so, ensure LD_LIBRARY_PATH is defined such that the correct version of libguide is found and used at run time. If you use static linking of libguide (discouraged), then:
Here is some more documentation from other sources on the Intel MKL:
If the source file ends with .F, .fpp, or .FPP, it is automatically preprocessed by cpp before it is compiled. If you want to use the C preprocessor with source files that do not end with .F, use the following compiler option to specify the filename suffix:
GNU: -x f77-cpp-input
Note that the preprocessing is not extended to the contents of files included by the "INCLUDE" directive - the #include preprocessor directive must be used instead.
For example, to preprocess source files that end with .f:
gfortran -x f77-cpp-input program.f
Intel: -cpp
To tell the compiler to link using C++ runtime libraries included with gcc/icc, use -cxxlib -gcc/-cxxlib -icc.
For example, to preprocess source files that end with .f:
ifort -cpp program.f
Generally, it is best to rename the file from <name>.f to <name>.F. The preprocessor will then be run automatically when the file is compiled.
A good page to look at for combining C/C++ and Fortran is Using C/C++ and Fortran together.
When calling your own Fortran routines from C/C++, you should not append an underscore (_) after the name.
A complete list of routines is in the XL Fortran Language Reference Manual.
Here are some links to pages that discuss how to use Fortran from C/C++:
There are two methods for submitting jobs to Black: through the LoadLeveler job queue, and with Condor from Gray. This section covers submitting jobs through LoadLeveler. These jobs may be serial, message-passing, or shared-memory in nature. Very small programs can be run directly from the command line, but anything larger should be submitted to the job queue.
To submit jobs to the job queue, use LoadLeveler. If you are used to PBS, the section below on migrating from PBS to LoadLeveler should help you get started. Job scheduling is managed by Moab; consult the Moab user's manual for details.
Hold a job temporarily:
llhold [job_id]
Resume job on hold:
llhold -r [job_id]
Potential problem for interactive jobs. If your login shell is csh or tcsh, the X11 server managing your display may not receive the correct X11 authority information (protocol and key-data) from xterm in this context. In that case you will have to open the server to the world by issuing the command:
xhost +
References
If you have been using the PBS system to submit jobs, then this section should help you get started with LoadLeveler.
Common commands
| Task | PBS command | LL command |
|---|---|---|
| Job submission | qsub <jobscript> | llsubmit <jobscript> |
| Job cancel | qdel <job id> | llcancel <job id> |
| Job status | qstat -u <username> | llq -u <username> |
| Extended job status | qstat -f <job id> | llq -l <job id> |
| Hold job (temporarily) | qhold <job id> | llhold <job id> |
| Resume job on hold | qrls <job id> | llhold -r <job id> |
| List usable queues | qstat -Q | llclass |
| Extended list of queues | qstat -Qf | llclass -l |
Environment variables
| Item | PBS variable | LL variable |
|---|---|---|
| Job ID | $PBS_JOBID | $LOADL_STEP_ID |
| Submission directory | $PBS_O_WORKDIR | $LOADL_STEP_INITDIR |
| Node/cpu list | $PBS_NODEFILE | $LOADL_PROCESSOR_LIST |
Resource specifications
| Resource | PBS command | LL command |
|---|---|---|
| Nodes/'chunks' | #PBS -l select=<# nodes> | #@ node=<# nodes> |
| Processors | #PBS -l ncpus=<# cpus> | #@ tasks_per_node=<# tasks> |
| Wall clock limit | #PBS -l walltime=[hh:mm:ss] | #@ wall_clock_limit=[hh:mm:ss] |
| Standard output file | #PBS -o <output filename> | #@ output=<output filename> |
| Standard error file | #PBS -e <error filename> | #@ error=<error filename> |
| Queue | #PBS -q <queue> | #@ class=<queue> |
| Transfer environment | #PBS -V | #@ environment=COPY_ALL |
| Send email to | #PBS -M <email> | #@ notify_user=<email> |
| Job name | #PBS -N <name> | #@ job_name=<name> |
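As a rough illustration of the mapping in the table above, the directive lines of an existing PBS script can be rewritten mechanically. This is only a sketch under stated assumptions: the function name pbs_to_ll is hypothetical, it handles only the directives listed in the table, it expects one option per #PBS line, and it passes every other line through unchanged.

```shell
# pbs_to_ll: read a PBS script on stdin and rewrite the directives listed
# in the table above as LoadLeveler keywords. Other lines pass through.
# Hypothetical helper for illustration only.
pbs_to_ll() {
    sed -E \
        -e 's/^#PBS[[:space:]]+-N[[:space:]]+(.*)$/#@ job_name=\1/' \
        -e 's/^#PBS[[:space:]]+-q[[:space:]]+(.*)$/#@ class=\1/' \
        -e 's/^#PBS[[:space:]]+-o[[:space:]]+(.*)$/#@ output=\1/' \
        -e 's/^#PBS[[:space:]]+-e[[:space:]]+(.*)$/#@ error=\1/' \
        -e 's/^#PBS[[:space:]]+-M[[:space:]]+(.*)$/#@ notify_user=\1/' \
        -e 's/^#PBS[[:space:]]+-V[[:space:]]*$/#@ environment=COPY_ALL/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+walltime=(.*)$/#@ wall_clock_limit=\1/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+select=(.*)$/#@ node=\1/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+ncpus=(.*)$/#@ tasks_per_node=\1/'
}

# Example: translate a short PBS header.
pbs_to_ll <<'EOF'
#PBS -N myjob
#PBS -q PU_LOW
#PBS -l walltime=00:10:00
#PBS -V
./myprogram
EOF
```

The result still needs a final queue statement and a review by hand, since the two systems do not correspond one to one.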
Common Moab scheduler commands
| Task | Command |
|---|---|
| Show running/queued jobs | showq \| less |
| Check job status | checkjob <job id> OR checkjob -v <job id> |
| Show assumed start time | showstart <job id> |
| Show fairshare information | diagnose -f \| less |
| Check node status | checknode <nodename> |
| Show reservations | showres |
Much of the information in this section comes from IU's page: http://rc.uits.iu.edu/kb/index.php?kbID=avgl.
The table below shows the queues which are available to Purdue users of Big Red via Black.
| Name of queue | Default nodes | Max nodes | Wall clock limit (default/max) | Job CPU limit (default/max) | Maximum slots | Comments |
|---|---|---|---|---|---|---|
| PU_LOW | 1 | 16 | 02:00:00/07:00:00 | 07:00:00/112:00:00 | 256 | Low queue for Purdue |
| PU_MED | 1 | 64 | 02:00:00/07:00:00 | 07:00:00/448:00:00 | 256 | Med queue for Purdue |
| PU_HIGH | 1 | 128 | 02:00:00/07:00:00 | 07:00:00/1792:00:00 | 512 | High queue for Purdue |
| PU_WIDE | 1 | 256 | 02:00:00/07:00:00 | 07:00:00/896:00:00 | 1024 | Wide queue for Purdue |
To see all queues, type llclass, or llclass -l for more information. Here is an example:
user123@BigRed:~> llclass
Name MaxJobCPU MaxProcCPU Free Max Description
d+hh:mm:ss d+hh:mm:ss Slots Slots
--------------- -------------- -------------- ----- ----- ---------------------
DEBUG 04:00:00 00:15:00 16 16 Fast Debug Queue
FAST 04:00:00 00:15:00 16 16 Fast Debug Queue
PU_LOW 112+00:00:00 7+00:00:00 228 256 Low queue for Purdue
PU_MED 448+00:00:00 7+00:00:00 256 256 Med queue for Purdue
PU_HIGH 896+00:00:00 7+00:00:00 256 512 High queue for Purdue
PU_WIDE 1792+00:00:00 7+00:00:00 740 1024 Wide queue for Purdue
LONG 1792+00:00:00 14+00:00:00 167 1456 Intermediate Queue for up to 32 nodes
MED 1792+00:00:00 14+00:00:00 167 1456 Intermediate Queue for up to 32 nodes
NORMAL 2048+00:00:00 2+00:00:00 288 1564 Big Queue for up to 256 nodes
BIG 2048+00:00:00 2+00:00:00 288 1564 Big Queue for up to 256 nodes
--------------------------------------------------------------------------------
"Maximum Slots" value of the class "FAST" is constrained by the MAX_STARTERS limit(s).
"Free Slots" values of the classes "FAST", "PU_LOW", "PU_WIDE", "LONG", "MED",
"NORMAL", "BIG" are constrained by the MAX_STARTERS limit(s).
user123@BigRed:~>
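The limits in the llclass listing use the d+hh:mm:ss format shown in its header, with the day part omitted for short limits. When comparing limits in a script, it can help to convert them to seconds. The function below is a small sketch; the name ll_limit_to_seconds is hypothetical and it assumes the input is always either hh:mm:ss or d+hh:mm:ss.

```shell
# ll_limit_to_seconds LIMIT
# Convert a limit such as "7+00:00:00" or "04:00:00" to seconds.
# Hypothetical helper for illustration only.
ll_limit_to_seconds() {
    limit="$1"
    days=0
    # Split off the optional "d+" day prefix.
    case "$limit" in
        *+*) days="${limit%%+*}"; limit="${limit#*+}" ;;
    esac
    # awk does the arithmetic so that fields like "08" are read as decimal.
    echo "$days:$limit" |
        awk -F: '{ print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }'
}

ll_limit_to_seconds 04:00:00     # MaxJobCPU of the DEBUG class
ll_limit_to_seconds 7+00:00:00   # MaxProcCPU of the PU_LOW class
```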
Example, Loadleveler Submission Script
# Specify which shell to use, will use owner's shell if none is given.
# @ shell = bash
# Specify job type, default is serial (string). For an OpenMP or MPI job, use parallel.
# @ job_type = parallel
# Specify environment, COPY_ALL means all environment variables from your shell are copied.
# @ environment = COPY_ALL
# You can specify whether or not to have LoadLeveler send you mail. (always|error|start|never|complete)
# @ notification = complete
# Specify the name of the queue (class).
# @ class = PU_LOW
# If you are charging to a project, specify the account name with this.
# @ account_no = abc
# Number of nodes to request (only for parallel programs).
# @ node = 4
# Number of tasks per node for MPI programs. For OpenMP or serial programs, take 1 or
# omit command for default of 1 task.
# @ tasks_per_node = 4
# Specify memory requirements (in MB).
# @ requirements=(Memory >= 1024)
# Sets the limit for the time a job can run. Default is 30 min (00:30:00).
# @ wall_clock_limit = 00:10:00
# Change to directory that job was submitted from, same as cd $PBS_O_WORKDIR - alternatively,
# just specify the full path for the program to run.
cd $LOADL_STEP_INITDIR
# For an OpenMP program, remember to set OMP_NUM_THREADS if you haven't exported your
# environment and set it there. The example below is for bash and asks for 4 threads.
export OMP_NUM_THREADS=4
# The program to run. Give the full path unless you specify to change to directory job was
# submitted from and was standing in said directory when submitting.
# @ job_name = ./omp_hello
# Specify the name of the output file.
# @ output = out.$(jobid)
# Specify the name of the file to write any error to.
# @ error = $(jobid).$(stepid).err
# Tell the system to put a copy of the job in the queue.
# @ queue
Job Command File Keywords Reference
These are the keywords you can use in a LoadLeveler job command file.
account_no arguments checkpoint class comment core_limit cpu_limit data_limit dependency environment error executable file_limit group hold image_size initialdir input job_cpu_limit job_name job_type max_processors min_processors notification notify_user output parallel_path preferences queue restart requirements rss_limit shell stack_limit startdate stepname user_priority wall_clock_limit
The command to submit the job submission file is the following:
user123@BigRed:~> llsubmit jobscript
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.826218" has been submitted.
user123@BigRed:~>
Checking job status (for user):
llq -u [username]
Example
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.826218.0         user123    12/3 14:06  I  50  PU_LOW
1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~>
Extended job status:
llq -l [job_id]
Deleting the job:
llcancel [job_id]
To run an interactive session under LoadLeveler, you have to create a LoadLeveler job submission file for interactive use. All interactive parallel jobs must use a LoadLeveler job command file. This file contains a number of LoadLeveler keyword statements which specify the various requirements of the interactive job. The script looks very similar to the usual batch/background job submission file.
Batch jobs do not need to specify a job class, but interactive jobs do. This is done with
#@ class = [class_name]
You can see the available classes with llclass:
bbrydsx@BigRed:~> llclass
Name MaxJobCPU MaxProcCPU Free Max Description
d+hh:mm:ss d+hh:mm:ss Slots Slots
--------------- -------------- -------------- ----- ----- ---------------------
DEBUG 04:00:00 00:15:00 16 16 Fast Debug Queue
ADMIN 1+08:00:00 02:00:00 0 0 Admin Queue
SERIAL 8+00:00:00 2+00:00:00 174 3036 Serial backfill queue
PU_LOW 112+00:00:00 7+00:00:00 173 256 Low queue for Purdue
PU_MED 448+00:00:00 7+00:00:00 160 256 Med queue for Purdue
PU_HIGH 896+00:00:00 7+00:00:00 96 512 High queue for Purdue
PU_WIDE 1792+00:00:00 7+00:00:00 429 1024 Wide queue for Purdue
IEDC 1792+00:00:00 14+00:00:00 551 2048 IEDC Queue
LONG 1792+00:00:00 14+00:00:00 111 1216 Intermediate Queue for up to 32 nodes
SPRUCE 1792+00:00:00 14+00:00:00 16 16 SPRUCE queue
NORMAL 2048+00:00:00 2+00:00:00 63 1820 Big Queue for up to 256 nodes
--------------------------------------------------------------------------------
"Maximum Slots" value of the class "ADMIN" is constrained by the MAX_STARTERS limit(s).
"Free Slots" values of the classes "ADMIN", "SERIAL", "PU_LOW", "PU_MED", "PU_HIGH",
"PU_WIDE", "IEDC", "LONG", "NORMAL" are constrained by the MAX_STARTERS limit(s).
bbrydsx@BigRed:~>
For interactive jobs #@ node_usage = shared should be specified.
Here is an example of a job submission file to run an interactive session (serial job). Since it opens an xterm, you must make sure that the display is set properly so that X Window programs will be allowed access to your display.
#@ output = $(job_name).out
#
#@ error = $(job_name).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ node_usage = shared
#
#@ environment = COPY_ALL
#
#@ executable = /usr/bin/xterm
#
#@ arguments = -ls -sb -sl 300
#
#@ queue
If you want to run an MPI/parallel job, set job_type = parallel and cpus = [wanted number of cpus]. You can set walltime with wall_clock_limit = hh:mm:ss.
A script should be submitted with the following command:
llsubmit [job_script]
You can run emacs with scripts like the above, using #@ executable = /usr/bin/emacs.
Getting the display to work:
As mentioned, you must have a workstation with X11 server and X11 authority running for that to work. You also need to define the display itself. The value of the DISPLAY will be passed on to xterm, because we use:
# @ environment = COPY_ALL
To find the values to set for your display to work, issue the command xauth list on your local terminal:
Example
user123@BigRed:~> xauth list
s10c2b11/unix:10 MIT-MAGIC-COOKIE-1 4f4bcf417d9d84592458f02e88eae05b
s10c2b12/unix:10 MIT-MAGIC-COOKIE-1 0a2361ad5d2f5f41e15f1d600cf5d3d3
user123@BigRed:~>
Select the whole first line of the listing and switch to an xterm on Black/BigRed. There you should type
xauth add s10c2b12/unix:10 MIT-MAGIC-COOKIE-1 0a2361ad5d2f5f41e15f1d600cf5d3d3
Change to your own values, of course!
Then define the display itself:
export DISPLAY=s10c2b12/unix:10
and you can submit the job with
llsubmit [jobscript]
To test that it actually works, you can try opening xclock:
xclock -display s10c2b12/unix:10.0 &
More information can be found here: http://beige.ucs.indiana.edu/gustav/ll-hints.html#interactive and here: http://beige.ucs.indiana.edu/B673/node93.html
In these examples I will use this simple script
#@ output = $(job_id).out
#@ error = $(job_id).err
#@ job_type = serial
#@ class = PU_LOW
#@ notification = never
#@ executable = /N/u/user123/BigRed/hello
#@ environment = COPY_ALL
#@ queue
To just submit the above job submission file (called 'hello_script') issue the following command
user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581935" has been submitted.
user123@BigRed:~>
Note that a job submission file may be called a job script at other sites.
Then, after a little while, you will get one or two new files in your directory
user123@BigRed:~> ls
hello  hello_script  hello.c  s10c2b5.dim.581935.out
user123@BigRed:~>
Note that the corresponding .err file will only be created if there actually were errors from the run.
You can now look at the results
user123@BigRed:~> less s10c2b5.dim.581935.out
Hello World!
user123@BigRed:~>
Obtain information about jobs in the queue
user123@BigRed:~> llq
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.560864.0         user456    7/29 16:42  R  50  PU_WIDE      s19c2b6
s10c2b5.560866.0         user456    7/29 16:42  R  50  PU_WIDE      s19c3b13
s10c2b5.560868.0         user456    7/29 16:43  R  50  PU_WIDE      s19c3b8
s10c2b5.560871.0         user456    7/29 16:43  R  50  PU_WIDE      s19c4b2
s10c2b5.560872.0         user456    7/29 16:43  R  50  PU_WIDE      s20c1b1
...
s10c2b5.567034.0         user789    7/31 11:17  R  50  PU_LOW       s16c3b7
s10c2b5.567035.0         user789    7/31 11:17  R  50  PU_LOW       s16c3b8
s10c2b5.567036.0         user789    7/31 11:17  R  50  PU_LOW       s16c3b9
1339 job step(s) in queue, 952 waiting, 0 pending, 386 running, 1 held, 0 preempted
llq -l will display a much longer and more detailed list.
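When the queue is long, a summary per job state is often easier to read than the full listing. The sketch below counts job steps by the ST column of llq output. It assumes the column layout shown in the listings in this section, where the Submitted column splits into a date and a time so that ST is the fifth field; the function name count_job_states is hypothetical.

```shell
# count_job_states: read llq output on stdin and print "STATE COUNT" pairs.
# Job lines are recognised by an Id ending in ".<number>.<number>".
# Hypothetical helper for illustration only.
count_job_states() {
    awk '$1 ~ /\.[0-9]+\.[0-9]+$/ { count[$5]++ }
         END { for (st in count) print st, count[st] }' | sort
}

# Sample input in the same layout as the listings in this section.
count_job_states <<'EOF'
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.560864.0         user456    7/29 16:42  R  50  PU_WIDE      s19c2b6
s10c2b5.567034.0         user789    7/31 11:17  R  50  PU_LOW       s16c3b7
s10c2b5.581944.0         user123    8/11 15:09  I  50  PU_LOW
EOF
```

On the cluster the same function could be fed directly from the command, for example with llq | count_job_states.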
user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581944" has been submitted.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123    8/11 15:09  I  50  PU_LOW
1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> llhold s10c2b5.581944.0
llhold: Hold command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123    8/11 15:09  H  50  PU_LOW
1 job step(s) in query, 0 waiting, 0 pending, 0 running, 1 held, 0 preempted
user123@BigRed:~> llhold -r s10c2b5.581944.0
llhold: Hold command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581944.0         user123    8/11 15:09  I  50  PU_LOW
1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~>
bbrydsx@BigRed:~> llstatus
Name          Schedd  InQ   Act  Startd  Run  LdAvg  Idle  Arch   OpSys
s10c1b1.dim   Down    0     0    Idle    0    0.00   9999  PPC64  Linux2
s10c1b2.dim   Down    0     0    Idle    0    0.00   9999  PPC64  Linux2
s10c1b3.dim   Down    0     0    Idle    0    0.00   9999  PPC64  Linux2
s10c1b4.dim   Down    0     0    Idle    0    0.00   9999  PPC64  Linux2
s10c2b5.dim   Avail   1332  379  None    0    0.07   9999  PPC64  Linux2
s11c1b1.dim   Down    0     0    Busy    4    3.61   9999  PPC64  Linux2
s11c1b10.dim  Down    0     0    Busy    4    4.74   9999  PPC64  Linux2
s11c1b11.dim  Down    0     0    Busy    4    3.49   9999  PPC64  Linux2
s11c1b12.dim  Down    0     0    Busy    4    4.40   9999  PPC64  Linux2
...
s9c4b6.dim    Down    0     0    Busy    4    5.65   9999  PPC64  Linux2
s9c4b7.dim    Down    0     0    Busy    4    4.62   9999  PPC64  Linux2
s9c4b8.dim    Down    0     0    Busy    4    9.50   9999  PPC64  Linux2
s9c4b9.dim    Down    0     0    Busy    4    4.67   9999  PPC64  Linux2
PPC64/Linux2    1021 machines  1332 jobs  3237 running tasks
Total Machines  1021 machines  1332 jobs  3237 running tasks
The Central Manager is defined on s10c2b5.dim
The API scheduler is in use
The following machines are marked SUBMIT_ONLY
s10c2b1.dim
s10c2b2.dim
s10c2b3.dim
s10c2b4.dim
s10c2b6.dim
The following 4 machines are marked absent
s10c1b5.dim
s10c1b6.dim
s10c1b7.dim
s10c1b8.dim
To reduce the long listing to the lines that mention Machine or Memory, filter it with grep:
llstatus -l | grep -E "Machine|Memory"
user123@BigRed:~> llsubmit hello_script
llsubmit: Processed command file through Submit Filter: "/home/loadl/scripts/submit_filter.pl".
llsubmit: The job "s10c2b5.dim.581947" has been submitted.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581947.0         user123     8/11 15:13 I  50  PU_LOW

1 job step(s) in query, 1 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~> llcancel s10c2b5.581947.0
llcancel: Cancel command has been sent to the central manager.
user123@BigRed:~> llq -u user123
Id                       Owner      Submitted   ST PRI Class        Running On
------------------------ ---------- ----------- -- --- ------------ -----------
s10c2b5.581947.0         user123     8/11 15:13 CA 50  PU_LOW

0 job step(s) in query, 0 waiting, 0 pending, 0 running, 0 held, 0 preempted
user123@BigRed:~>
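When these commands are scripted rather than typed, it helps to capture the job name that llsubmit prints so it can be reused later. The following is a minimal sketch, assuming the confirmation line always has the form shown in the transcript above; `job_name_from_llsubmit` is a hypothetical helper, not a LoadLeveler command.

```shell
#!/bin/sh
# Extract the quoted job name from an llsubmit confirmation line, e.g.
#   llsubmit: The job "s10c2b5.dim.581947" has been submitted.
# job_name_from_llsubmit is a hypothetical helper, not a LoadLeveler command.
# Lines that do not match the confirmation format produce no output.
job_name_from_llsubmit() {
    sed -n 's/^llsubmit: The job "\([^"]*\)" has been submitted\.$/\1/p'
}

# Demonstration on the message shown in the transcript above.
sample='llsubmit: The job "s10c2b5.dim.581947" has been submitted.'
printf '%s\n' "$sample" | job_name_from_llsubmit
```

In a real session the function would read from `llsubmit hello_script` instead of the sample string. Note that llq lists the job step in a slightly different form (s10c2b5.581947.0), so check which form llhold and llcancel accept on your system before passing the captured name to them.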
You can submit multiple jobs from a single script by using several 'queue' statements. Note that LoadLeveler statements in effect for the first job are generally in effect for all subsequent jobs in the same job command file, unless overridden later.
Submitting jobs this way is very useful if you want to run the same executable with different input/output files. Here is an example of that.
#@ executable = ./myprogram
#
#@ input = myprogram.input_1
#@ output = myprogram.out_1
#@ error = myprogram.err_1
#@ queue
#
#@ input = myprogram.input_2
#@ output = myprogram.out_2
#@ error = myprogram.err_2
#@ queue
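For more than a handful of runs, writing the stanzas by hand gets tedious. The sketch below generates a job command file in the same layout as the example above. `make_multi_job` is a hypothetical helper name, and the file name pattern is copied from the example.

```shell
#!/bin/sh
# Print a job command file with one input/output/error/queue stanza per run.
# make_multi_job is a hypothetical helper name, not part of LoadLeveler.
make_multi_job() {
    exe=$1
    runs=$2
    printf '#@ executable = %s\n' "$exe"
    n=1
    while [ "$n" -le "$runs" ]; do
        printf '#\n'
        printf '#@ input = myprogram.input_%s\n' "$n"
        printf '#@ output = myprogram.out_%s\n' "$n"
        printf '#@ error = myprogram.err_%s\n' "$n"
        printf '#@ queue\n'
        n=$((n + 1))
    done
}

# Reproduce the two-run example; redirect to a file before submitting.
make_multi_job ./myprogram 2
```

Redirecting the output to a file (for example `make_multi_job ./myprogram 10 > multi_job`) gives a script that can be passed to llsubmit.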
Same as above, but using predefined LoadLeveler macros to generate different output files. Five jobs will be queued, each of which reads a unique input file and creates unique output and error files.
#@ executable = myprogram
#
#@ input = myprogram.in.$(Process)
#@ output = myprogram.out.$(Cluster).$(Process)
#@ error = myprogram.err.$(Cluster).$(Process)
#@ queue
#@ queue
#@ queue
#@ queue
#@ queue
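Each queue statement receives its own $(Process) value, so every one of the five jobs needs a matching input file. The sketch below lists the expected file names and reports any that are missing. It assumes that $(Process) counts up from 0, which the ".0" step suffix in the llq output suggests but which you should verify on your system. Both function names are hypothetical.

```shell
#!/bin/sh
# List the input files that N "queue" statements would read, assuming
# $(Process) counts up from 0 (an assumption; verify on your system).
# expected_inputs and missing_inputs are hypothetical helper names.
expected_inputs() {
    prefix=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s.%s\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# Report any expected input file that does not exist yet.
missing_inputs() {
    expected_inputs "$1" "$2" | while IFS= read -r f; do
        [ -e "$f" ] || printf 'missing: %s\n' "$f"
    done
}

# The five-job example above would read these files.
expected_inputs myprogram.in 5
```

Running `missing_inputs myprogram.in 5` in the submission directory before calling llsubmit prints nothing when all five files are present.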
#!/bin/csh
#
# LoadLeveler commands
#@ initialdir = /N/u/user123/BigRed
#@ error = run1.$(Cluster).err
#@ output = run1.$(Cluster).out
#@ environment = MP_SHARED_MEMORY=yes
#@ queue
# Script commands
echo 'Copying input file to /scratch'
cp input.1 /scratch/input.1
echo 'Running the program'
myprogram
echo 'Copying output file back'
cp /scratch/output.1 output.1
rm /scratch/input.1
echo 'Cleanup done. Job completed.'
#@ output = outfile.out
#@ error = errorfile.err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ executable = /N/u/user123/BigRed/hello
#
#@ queue
Important: You must assign values to "output" and "error" if your program writes to stdout and/or stderr. If not specified, these default to /dev/null.
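Because a missing keyword silently discards output, a quick check before submitting can save a wasted run. The sketch below only pattern-matches directive lines and does not validate them the way llsubmit would. `check_job_file` is a hypothetical helper name.

```shell
#!/bin/sh
# Warn when a job command file lacks an "output" or "error" keyword.
# check_job_file is a hypothetical helper; it only pattern-matches
# directive lines and does not validate them the way llsubmit would.
check_job_file() {
    file=$1
    status=0
    for key in output error; do
        # Directive lines look like "#@ output = ..." or "# @ output=...".
        if ! grep -Eiq "^#[[:space:]]*@[[:space:]]*${key}[[:space:]]*=" "$file"; then
            printf 'warning: %s has no "%s" keyword\n' "$file" "$key"
            status=1
        fi
    done
    return "$status"
}
```

The function returns a non-zero status when either keyword is missing, so it can guard a submission: `check_job_file hello_script && llsubmit hello_script`.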
If you want to make sure that the output/error from each job goes to a separate file, you can use the values assigned by LoadLeveler to the Executable, Cluster, and Process values.
Cluster: the unique job ID
Process: the number assigned to each job step queued within a script
This job submission file example executes a serial job twice, giving each a different output filename.
#@ output = $(Executable).$(Cluster).$(Process).out
#@ error = $(Executable).$(Cluster).$(Process).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ executable = /N/u/user123/BigRed/myprogram
#
#@ arguments = args1 args2 args3
#@ queue
#
#@ arguments = args4 args5 args6
#@ queue
If no executable is specified, LoadLeveler will assume that everything following the #@ queue statement consists of commands to be executed. Among other things, this lets you run shell commands or several commands in sequence. Here is an example:
#@ output = myprogram.$(Cluster).out
#@ error = myprogram.$(Cluster).err
#
#@ job_type = serial
#
#@ class = PU_LOW
#
#@ notification = never
#
#@ environment = COPY_ALL
#
#@ queue
xlc /N/u/user123/BigRed/hello.c -o /N/u/user123/BigRed/hello
/N/u/user123/BigRed/hello
rm /N/u/user123/BigRed/hello
The following script submits a 4-CPU OpenMP parallel job. Although you specify the number of CPUs to reserve using the tasks_per_node keyword, your OpenMP program will spawn only the number of threads given by the OMP_NUM_THREADS environment variable, so you must set that variable in the batch script. Set it as follows:
In Tcsh/Csh:
setenv OMP_NUM_THREADS <number of threads>
In Bash:
export OMP_NUM_THREADS=<number of threads>
#@ output = $(Executable).$(Cluster).output
#@ error = $(Executable).$(Cluster).error
#@ class = PU_LOW
#
# specify job type, default is serial (string). For an OpenMP or MPI job, use parallel
#@ job_type = parallel
#
# number of nodes to request (only for parallel programs)
#@ node = 4
# number of tasks per node for MPI programs. For OpenMP or serial programs, take 1 or
# omit command for default of 1 task
#@ tasks_per_node = 1
#
# specify memory requirements (in kb)
# @ requirements=1000000
#
# wall_clock_limit sets the limit for the time a job can run. Default is 30 min (00:30:00).
# @ wall_clock_limit = 00:10:00
#
# Change to directory that job was submitted from, same as cd $PBS_O_WORKDIR - alternatively, just
# specify the full path for the program to run
cd $LOADL_STEP_INITDIR
#
# The program to run. Give the full path unless you specify to change to directory job was submitted
# from and was standing in said directory when submitting
# @ job_name = ./omp_hello
#
# specify environment, COPY_ALL means all environment variables from your shell are copied
#@ environment = COPY_ALL; OMP_NUM_THREADS=4
#
#@ queue
Then submit the job with
llsubmit openmp_script
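The link between reserved CPUs and the thread count can be made explicit in a Bourne-style batch script. The sketch below caps the requested thread count at the number of cores on a Black node (4, per the hardware table). The capping policy and the name `pick_omp_threads` are illustrative assumptions, not site requirements.

```shell
#!/bin/sh
# Pick an OpenMP thread count for a Black node. CORES_PER_NODE=4 comes from
# the hardware table; pick_omp_threads is a hypothetical helper name.
CORES_PER_NODE=4

pick_omp_threads() {
    requested=${1:-$CORES_PER_NODE}
    if [ "$requested" -gt "$CORES_PER_NODE" ]; then
        # More threads than cores would oversubscribe the node, so cap it.
        requested=$CORES_PER_NODE
    fi
    printf '%s\n' "$requested"
}

OMP_NUM_THREADS=$(pick_omp_threads 4)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

Placing these lines before the program invocation keeps the thread count and the reservation in one place, so changing one without the other is harder to do by accident.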
Job submission file and running MPI jobs
To submit a parallel/MPI job, you have to add #@ job_type = parallel to the job submission file. There are a number of other parameters which can/should be set:
To determine which nodes were used for your parallel execution, add
echo $LOADL_PROCESSOR_LIST > myhosts
to your job submission file, or ask LoadLeveler to mail you when the job completes. The mail lists the nodes used. To turn it on, add this to your job submission file:
#@ notification = complete
#@ notify_user = user@address.domain
Alternatively, set the MP_INFOLEVEL environment variable to a value above 1 and look in the file where standard error is written:
#@ error = myjob.err
#@ environment = MP_INFOLEVEL=2
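Once the host list has been captured, a per-node tally is often easier to read than the raw list. The sketch below summarizes a blank-separated host list such as the value of $LOADL_PROCESSOR_LIST. `tasks_per_host` is a hypothetical helper name, and the sample host names are for illustration only.

```shell
#!/bin/sh
# Summarize a blank-separated host list such as $LOADL_PROCESSOR_LIST into
# "count host" lines. tasks_per_host is a hypothetical helper name.
tasks_per_host() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed '/^$/d' | sort | uniq -c |
        awk '{ print $1, $2 }'
}

# Sample value for illustration only; inside a job, pass "$LOADL_PROCESSOR_LIST".
tasks_per_host "b509 b510 b509 b510"
```

Inside a job script, `tasks_per_host "$LOADL_PROCESSOR_LIST" > myhosts` would record the tally instead of the raw list.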
Example MPI job submission file
#
#@ error = myprogram.$(Cluster).$(process).err
#@ output = myprogram.$(Cluster).$(process).out
#
# @ job_type = MPICH
# @ account_no = NONE
#
#@ notification = complete
#@ notify_user = user123@address.domain
#
#@ class = PU_MED
#
# @ environment=COPY_ALL;
#
#@ node = 4
#@ tasks_per_node = 4
#
#@ wall_clock_limit= 15:00:00
#@ queue
#
## Users should always cd into their execution directory due to
## a bug within LoadLeveler in dealing with the initialdir keyword.
cd [execution directory]
## Use mpirun to execute your MPICH program in parallel;
## $LOADL_TOTAL_TASKS and $LOADL_HOSTFILE are defined by
## LoadLeveler for jobs of type MPICH.
mpirun -np $LOADL_TOTAL_TASKS -machinefile $LOADL_HOSTFILE ./myprogram
The above job submission file asks for 4 nodes, and 4 tasks on each node. The program is called 'myprogram'.
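The request multiplies out to 4 x 4 = 16 MPI tasks, which is presumably the value mpirun receives through $LOADL_TOTAL_TASKS. The arithmetic can be sketched as follows. `total_tasks` is a hypothetical helper name.

```shell
#!/bin/sh
# Total MPI tasks requested by the "node" and "tasks_per_node" keywords.
# total_tasks is a hypothetical helper name, not part of LoadLeveler.
total_tasks() {
    echo $(( $1 * $2 ))
}

# node = 4, tasks_per_node = 4, as in the example above.
total_tasks 4 4
```

With 4 cores per Black node, 4 tasks per node places one task on each core.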
Submitting Parallel MPI Jobs
To submit a parallel job, you have three options:
paralleljob ./my_parallel_program
cat mfile
b509
b510
b509
b510
b509
b510
b509
b510
llsubmit parallel_jobscript.sh
Note: If you have multiple #@ environment statements, only the last one takes effect. If you need to set multiple environment variables, separate them with semicolons in a single #@ environment statement. For example:
#@ environment = MP_SHARED_MEMORY=yes;MP_INFOLEVEL=3;MP_LABELIO=yes
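When the variable list is assembled by a script, joining the pairs in one place avoids emitting several environment statements by accident. This is a minimal sketch. `environment_line` is a hypothetical helper name.

```shell
#!/bin/sh
# Join VAR=value arguments into one "#@ environment" line, separated by
# semicolons. environment_line is a hypothetical helper name.
environment_line() {
    line=""
    for pair in "$@"; do
        if [ -z "$line" ]; then
            line=$pair
        else
            line="$line;$pair"
        fi
    done
    printf '#@ environment = %s\n' "$line"
}

environment_line MP_SHARED_MEMORY=yes MP_INFOLEVEL=3 MP_LABELIO=yes
```

The helper writes a single statement no matter how many pairs it is given, so a later statement can never override an earlier one.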
Also, do not use the #@ executable statement if you are running parallel jobs. Parallel jobs use the job submission file as the executable.
Most of this information was taken from http://rc.uits.iu.edu/kb/index.php?kbID=autn where more information can be found.
Add one of +R, +R-2.4.1-gcc-32, +R-2.4.1-ibm-32, +R-2.5.0-ibm-64, or +R-2.6.0 to your .soft file and run resoft before running your program.
The following script (Rjob.sh) submits an R job:
#@ class = serial
#@ error = myRjob.err
#@ output = myRjob.out
#@ input = myRjob.in
#@ queue
cd [execution_dir]
R
Submit with
llsubmit Rjob.sh
There are currently no FAQs for Black.