The Gray cluster is solely a development platform to be used alongside the Indiana Economic Development Corporation (IEDC) machine Black. Gray is a place to compile code (mostly serial Condor applications) that is to be run on Black. Black is currently housed with Indiana University's "Big Red" system in Bloomington, Indiana. However, Gray is located on Purdue's West Lafayette campus. Gray includes a front-end server, several worker-node blades, and extra front-end hosts for campus and TeraGrid Condor users.
Gray nodes feature the same architecture as Black and Indiana University's "Big Red" for compatibility when compiling code. All are IBM JS21 Bladeservers with 8 GB of memory.
| Number of Nodes | Processor | Cores per Node | Memory per Node | Interconnect | TeraFlops |
|---|---|---|---|---|---|
| 4 | Dual-Processor 2.3 GHz Dual-Core PowerPC 970MP | 4 | 8 GB | Gigabit Ethernet | 0.07 |
All Gray nodes run SuSE Linux Enterprise Server 9. There is no job scheduling system, as Gray is to be used only for source code compilation. Operating system patches are applied monthly or as security needs dictate.
Gray is a cluster operated by RCAC. Purdue faculty, staff, and students with the approval of their advisor may request access to Gray using the online Research Computing Account Request Form.
To log in to Gray, users may log in to the front-end host gray.rcac.purdue.edu via SSH.
All access to the RCAC systems must be through secure (encrypted) connections. Standard telnet and FTP are not supported. SSH, SCP, and SFTP may be used instead.
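As a minimal sketch (for bash and sh), the connection commands can be wrapped in small shell functions. The function names and the myusername placeholder are hypothetical; only the host name comes from this guide.

```shell
# Front-end host named in this guide.
gray_host=gray.rcac.purdue.edu

# Hypothetical helper: open an interactive SSH session as the given user.
gray_login() {
    ssh "$1@$gray_host"
}

# Hypothetical helper: copy a local file to a path on Gray over SCP.
gray_put() {
    scp "$2" "$1@$gray_host:$3"
}

# Usage (not run here, since it requires an account):
#   gray_login myusername
#   gray_put myusername myprogram.c src/
```

Defining the functions does not open any connection; nothing happens until one of them is called.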
Secure Shell or SSH is a way of establishing a secure channel between a local and a remote computer. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. It is usually used to log in to a remote machine and execute commands similar to telnet, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. The associated SFTP and SCP protocols may be used to transfer files. There are many SSH clients available, depending on the operating system you use.
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
SSH can be used in conjunction with many different means of authentication. One popular authentication method is called Public Key Authentication (PKA). PKA is a method of establishing your identity to a remote computer using related sets of encryption data called keys. PKA is a more secure alternative to traditional password-based authentication with which you are probably familiar.
To employ PKA via SSH, you manually generate a keypair (also called SSH keys) in the location from where you wish to initiate a connection to a remote machine. This keypair consists of two text files, one of which is called a private key and the other a public key. You keep the private key file confidential on your local machine or local home directory (hence the name "private" key). You then log in to a remote machine (if possible) and append the corresponding public key text to the end of a specific file, or have a system administrator do so on your behalf. In future login attempts, the remote machine uses the public key to verify that you hold the matching private key, which then grants you access to the remote machine.
As a user, you can create, maintain, and employ as many keypairs as you wish. If you connect to a computational resource from your work laptop, your work desktop, and your home desktop, you can create and employ keypairs on each. You can also create multiple keypairs on a single local machine to serve different purposes, such as establishing access to different remote machines, or establishing different types of access to a single remote machine. In short, PKA via SSH offers a secure but flexible means of identifying yourself to all kinds of computational resources.
When you create a keypair, you are prompted to provide a passphrase for the private key. This passphrase is different from a password in a number of ways. First, a passphrase is, as the name implies, a phrase. It can include most types of characters, including spaces, and has no limits on length. Second, this passphrase is not transmitted to the remote machine for verification. It is used only to unlock your local private key and applies only to that one private key.
Perhaps you are wondering why you would need a private key passphrase at all when using PKA. If the private key is kept secure, why the need for a passphrase just to use it? Indeed, if the location of your private keys were always completely secure, a passphrase might not be needed. In reality, a number of situations could arise in which someone may improperly gain access to your private key files. In these situations, a passphrase offers another level of security for you, the user who created the keypair.
Think of the private key/passphrase combination as being analogous to your ATM card/PIN combination. The ATM card itself is the object that grants access to your important accounts, and as such, should be kept secure at all times—just as a private key should. But if you ever lose your wallet or your ATM card is stolen, you are glad that your PIN exists to offer you another level of protection. The same is true for a private key passphrase.
When you create a keypair, you should always provide a corresponding private key passphrase. For security purposes, avoid using phrases that would be guessed by automated programs (e.g. phrases that consist solely of words in English-language dictionaries). This passphrase can never be recovered if forgotten, so make note of it. There are only limited situations when the use of a non-passphrase-protected private key is warranted—conducting automated file backups is one such situation. If you need to use a non-passphrase-protected private key to conduct automated backups to Fortress, see the No-Passphrase SSH Keys section.
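The keypair workflow described above might look like the following sketch (for bash and sh). It assumes the OpenSSH tools, where the "specific file" on the remote side is ~/.ssh/authorized_keys; the key file name and the demo directories are hypothetical, and a local directory stands in for the remote account so that no real keys are touched.

```shell
# Work in a throwaway directory so no real keys are touched.
demo_dir=$(mktemp -d "${TMPDIR:-/tmp}/pka_demo.XXXXXX")
key_file="$demo_dir/id_rsa_gray"          # hypothetical key file name

# Generate an RSA keypair. -N supplies the passphrase non-interactively,
# which is only acceptable in a demo; omit -N to be prompted instead.
ssh-keygen -q -t rsa -b 2048 -N "example passphrase, demo only" \
    -C "demo key for gray" -f "$key_file"

# Stand-in for the remote account's ~/.ssh directory (assumption: OpenSSH).
remote_ssh_dir="$demo_dir/remote_home/.ssh"
mkdir -p "$remote_ssh_dir"
chmod 700 "$remote_ssh_dir"

# Append (never overwrite) the public key to authorized_keys.
cat "$key_file.pub" >> "$remote_ssh_dir/authorized_keys"
chmod 600 "$remote_ssh_dir/authorized_keys"
```

Appending with >> rather than > matters because the file may already hold public keys from your other machines.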
If you have received a default password as part of the process of obtaining your account, you should change it immediately when you log on for the first time. This can be done from any terminal/SSH session with the command "passwd". You will have the same password on all RCAC systems. If you change your password on any one RCAC system, it will change on all RCAC systems.
If you already have a Purdue career account, then you will initially be given the same userid and password as your career account. There is no need to change your career account password because you have received an account on RCAC systems.
There is not currently any requirement regarding how often you must change your password within RCAC, but for security reasons changing a password every six months, preferably every three months, is good practice.
All passwords should:
Never share your password with another user or make your password known to anyone else. Systems staff will NEVER ask for your password, by email or otherwise.
File storage options on RCAC systems include home directories, scratch file systems, /tmp, and long-term or permanent storage. Each of these has different performance characteristics and intended uses, and some vary from system to system as well. Home directories and long-term storage are backed up nightly, but scratch and /tmp are not and may occasionally be purged without warning. Below is more detail about each of these storage options.
Your home directory is the default directory you are placed in when you log in.
You should use this space for storing files you want to keep long term such as source code, scripts, input data sets, etc. It should also be used for files you want to keep and which you use often. The home directory will physically reside on the BlueArc NFS Server. You can find the path to your home directory by logging in, and typing pwd:
$ pwd
/home/ba01/u103/myusername
The second component of the reply indicates the name of the host where your home directory physically resides. In this example, the home directory is on the RCAC home directory file server named "ba01" under area "u103". This will vary from person to person. Remember, you can always check where your home directory is located by doing a pwd command in your home directory.
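As a small sketch (for bash and sh), the file server and area can be pulled out of the path with cut. The path below is the example from this guide; the variable names are hypothetical.

```shell
# Example home path from this guide; on a real system use: home_path=$(cd && pwd)
home_path=/home/ba01/u103/myusername

# Split on "/"; field 1 is empty because the path starts with "/".
server=$(printf '%s\n' "$home_path" | cut -d/ -f3)
area=$(printf '%s\n' "$home_path" | cut -d/ -f4)

echo "file server: $server, area: $area"
```

For the example path this prints "file server: ba01, area: u103".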
Regardless of its physical location, your home directory and its contents are available on almost all the RCAC front-end hosts and their nodes via the Network File System (NFS). The only exception is Black.
Note that your home directory has a quota capping the size and/or number of files you may store within. For more information, refer to the Storage Quotas / Limits Section.
Less than 7 days ago
The Rosen Center has implemented self-service file recoveries. To recover accidentally deleted files yourself, simply log into any RCAC system and cd to /autohome/ba01_snap/backup_snap/u1XX/username, which will provide a read-only copy of your home directory. Note that u1XX should be changed to one of u100-u114, depending on which of these your home directory is located on. Use the command 'pwd' in your home directory to find the right one. You should then be able to copy the file back into its original location by using a command like
cp -r ./* ~username
If you need to retrieve files from an earlier snapshot, do 'ls /autohome/ba01_snap/' and you will see directories with names like backup_snap_YYYYMMDDHHMMSS. The "backup_snap" directory contains the most recent home file system snapshot, and these other directories contain daily snapshots for the last 7 days. So, if you wanted to retrieve files from the snapshot taken on, say, 9/24 rather than last night's, you could cd to /autohome/ba01_snap/backup_snap_20080924000102/u1XX/username (provided less than 7 days had passed since that date), and then cp files and/or whole directories into your home directory or your scratch directory.
Examples
Copying a file from backup to home directory:
$ cd /autohome/ba01_snap/backup_snap
/autohome/ba01_snap/backup_snap$ ls
$__NDMP__ u100 u101 u102 u103 u104 u105 u106 u107 u108 u109 u110 u111 u112 u113 u114
/autohome/ba01_snap/backup_snap$ cd u1XX/username
/autohome/ba01_snap/backup_snap/u1XX/username$ ls
C WWW openmp_programs C++ mpi fortran pbsscripts C_Fortran matlab.submit teragrid OpenMP_MPI openMP
/autohome/ba01_snap/backup_snap/u1XX/username$ cp matlab.submit ~username
/autohome/ba01_snap/backup_snap/u1XX/username$
Copying a directory from backup to scratch:
$ cd /autohome/ba01_snap/backup_snap
/autohome/ba01_snap/backup_snap$ ls
$__NDMP__ u100 u101 u102 u103 u104 u105 u106 u107 u108 u109 u110 u111 u112 u113 u114
/autohome/ba01_snap/backup_snap$ cd u1XX/username
/autohome/ba01_snap/backup_snap/u1XX/username$ ls
C WWW openmp_programs C++ mpi fortran pbsscripts C_Fortran matlab.submit teragrid OpenMP_MPI openMP
/autohome/ba01_snap/backup_snap/u1XX/username$ cp -r fortran $RCAC_SCRATCH
/autohome/ba01_snap/backup_snap/u1XX/username$
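The two copies above could be wrapped in a small helper that refuses to overwrite files which still exist at the destination. This is only a sketch (for bash and sh): the function name is hypothetical, and temporary directories stand in for the snapshot and home directories.

```shell
# Hypothetical helper: copy one item from a read-only snapshot tree to a
# destination directory, refusing to overwrite anything that already exists.
restore_item() {
    snap_dir=$1; item=$2; dest_dir=$3
    if [ ! -e "$snap_dir/$item" ]; then
        echo "not in snapshot: $item" >&2
        return 1
    fi
    if [ -e "$dest_dir/$item" ]; then
        echo "already exists, not overwriting: $dest_dir/$item" >&2
        return 1
    fi
    cp -rp "$snap_dir/$item" "$dest_dir/"
}

# Demo with temporary stand-ins for the snapshot and home directories.
demo=$(mktemp -d "${TMPDIR:-/tmp}/snap_demo.XXXXXX")
mkdir -p "$demo/snapshot/fortran" "$demo/home"
echo "job script" > "$demo/snapshot/matlab.submit"
echo "      program hello" > "$demo/snapshot/fortran/hello.f"

restore_item "$demo/snapshot" matlab.submit "$demo/home"
restore_item "$demo/snapshot" fortran "$demo/home"
```

On an RCAC system the first argument would be the snapshot path for your own area, for example /autohome/ba01_snap/backup_snap/u1XX/username.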
More than 7 days ago
If you need to recover the contents of your home from a backup made more than 7 days ago, please request it by running the flost command on one of our front-end hosts, like steele.rcac or radon.rcac.
Note that only files which have been backed up (that is, files which existed when the nightly backup ran at around 1 am) can be recovered. If you delete a new file minutes or hours after creating it, it cannot be recovered.
Scratch directories are provided by RCAC and are intended for short-term file storage only.
Backups are not performed on scratch directories. In the event of a disk crash or file purge, files in scratch directories cannot be recovered. Please be sure to copy any important files to more permanent storage.
All files stored in RCAC scratch directories older than 90 days will be automatically removed (purged). Owners of these files will be notified one week before removal via email. For more information, please refer to our Scratch File Purging Policy.
RCAC scratch directories are provided by a central BlueArc server and are accessible from most RCAC systems. There are two primary scratch file systems: scratch95 and scratch96. A scratch directory already exists for all Gray users. Your RCAC scratch directory is located under scratch95 or scratch96 within a subdirectory by the first letter of your username.
To find the path to your RCAC scratch directory, run myscratch:
$ myscratch
/scratch/scratch96/m/myusername
The variable $RCAC_SCRATCH is also set to your RCAC scratch directory path. Use this variable in any scripts. Your actual scratch directory path may change without warning, but this variable will remain current.
$ echo $RCAC_SCRATCH
/scratch/scratch96/m/myusername
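In a script, the variable might be used as in the following sketch (for bash and sh). The fallback to a temporary directory and the project name myproject are assumptions made for illustration, so that the script also runs on a machine where RCAC_SCRATCH is not set.

```shell
# Fall back to a temporary directory when RCAC_SCRATCH is not set, for
# example when testing the script on a non-RCAC machine (assumption).
scratch_root=${RCAC_SCRATCH:-${TMPDIR:-/tmp}}

# "myproject" is a hypothetical project name.
work_dir="$scratch_root/myproject"
mkdir -p "$work_dir"

echo "writing intermediate files under $work_dir"
```

Because the path is built from the variable each time the script runs, the script keeps working if the scratch directory is moved.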
To find the path to someone else's RCAC scratch directory, use the command findscratch:
$ findscratch someuser
/scratch/scratch95/s/someuser
Note that your RCAC scratch directory has a quota capping the size and/or number of files you may store within. For more information, refer to the Storage Quotas / Limits Section.
The /tmp directory is intended for temporary files that are used during the execution of a process or job or while you examine files created by your jobs. Used properly, /tmp may provide faster local storage to an active process than any other storage option. However, do not use it for longer-term storage or critical results.
Files stored in /tmp are not backed up and are removed whenever space is low or whenever the system is rebooted. In the event of a loss, files in /tmp cannot be recovered, so use it only for files that can be recreated relatively easily.
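One way to follow this advice is sketched below (for bash and sh): work in a uniquely named directory under /tmp, copy out only what must survive, and clean up afterwards. The file names are hypothetical, and a second temporary directory stands in for a safer destination such as a scratch or home directory.

```shell
# Create a private, uniquely named working directory under /tmp.
job_tmp=$(mktemp -d "${TMPDIR:-/tmp}/myjob.XXXXXX")

# Stand-in for a safer location such as a scratch or home directory.
keep_dir=$(mktemp -d "${TMPDIR:-/tmp}/keep_demo.XXXXXX")

# Do the temporary work, then copy out only what must survive.
echo "intermediate data" > "$job_tmp/work.dat"
sort "$job_tmp/work.dat" > "$job_tmp/result.dat"
cp "$job_tmp/result.dat" "$keep_dir/"

# Clean up so /tmp space is returned promptly.
rm -rf "$job_tmp"
```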
Long-term Storage or Permanent Storage is available to RCAC users on the DXUL/UniTree archival storage system, commonly referred to as "Fortress". DXUL (DiskXtender for Unix and Linux) and UniTree make up the software that manages this hierarchical storage system. Program files, data files and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity. However, since two copies are retained for every file, the usable capacity is only 600 TB.
Recently used files smaller than 0.5 MB have their primary copy stored on low-cost disks, but the second copy is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the large latency to access a larger file (usually involving a copy from a tape cartridge) makes it unsuitable for use as active storage.
In addition to poor performance, these two uses can cause severe problems with the system itself:
Do not use Fortress as a second home directory. Instead, use tar or some similar archive tool to combine all the smaller files you wish to store into a single large file first.
For active data storage you should use either local storage or a scratch file system. You may then copy any results you wish to archive to Fortress when computation is complete.
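As a sketch of the bundling step (for bash and sh), the commands below combine a directory of small files into one compressed archive and check its contents before anything is copied to Fortress. The directory and file names are hypothetical, and a temporary directory stands in for real results.

```shell
# Build a demo directory of many small files (stand-in for real results).
demo=$(mktemp -d "${TMPDIR:-/tmp}/bundle_demo.XXXXXX")
mkdir "$demo/results"
for i in 1 2 3 4 5; do
    echo "run $i" > "$demo/results/run_$i.out"
done

# Combine them into one compressed archive; -C keeps paths relative.
tar czf "$demo/results.tar.gz" -C "$demo" results

# Verify the archive lists every file before deleting any originals.
archived=$(tar tzf "$demo/results.tar.gz" | grep -c '\.out$')
echo "archived $archived files"
```

The single results.tar.gz file is then what would be copied to Fortress, rather than the individual small files.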
Fortress is directly accessible (via FTP, SSH, SCP, SFTP, and NFS) from all RCAC systems, as well as most systems in ECN and CS and from several other major servers on campus. To access Fortress in any way other than NFS, you must log in to fortress.rcac.purdue.edu. RCAC has more information about Fortress, including how to obtain a Fortress account and how to access your files on Fortress.
There are many environment variables related to storage locations and paths which are automatically set for you upon log-in, or may be changed if necessary. In addition, many more environment variables are set for specific applications, such as compilers, when "modules" for these applications are loaded. (See the module command section for more information.)
Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:
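Environment variable names are conventionally all uppercase, and the value of a variable is referenced by prefixing its name with a dollar sign ($). Variables may be used on the command line or in any scripts in place of and in combination with hard-coded values. When a variable is followed directly by characters that could be part of a name, enclose the name in braces:
$ ls $HOME
...
$ ls $RCAC_SCRATCH/myproject
...
$ ls $RCAC_SCRATCH/myproject/${HOSTNAME}_data
...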
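Braces matter when a variable reference is followed directly by letters, digits, or an underscore. The sketch below (for bash and sh) shows the difference; NODE and BASE are hypothetical variables chosen for illustration.

```shell
# Hypothetical variables for illustration.
NODE=gray01
BASE=/scratch/demo

# Wrong: the shell looks up a variable named NODE_data, which is unset,
# so the last path component silently becomes empty. The subshell with
# "set +u" keeps this demo from aborting in shells that run with "set -u".
wrong=$(set +u; echo "$BASE/myproject/$NODE_data")

# Right: braces mark where the variable name ends.
right="$BASE/myproject/${NODE}_data"

echo "wrong: $wrong"
echo "right: $right"
```

Running scripts with "set -u" turns the silent mistake into an error, which is often the safer choice.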
You may find the value of any environment variable by using the echo command:
$ echo $RCAC_SCRATCH
/scratch/scratch95/m/myusername
$ echo $SHELL
/usr/local/bin/tcsh
You may list the values of all environment variables using the env command:
$ env
USER=myusername
HOME=/home/ba01/u101/myusername
RCAC_SCRATCH=/scratch/scratch95/m/myusername
SHELL=/usr/local/bin/tcsh
...
You may create or overwrite an environment variable using either export or setenv, depending on your shell:
(for bash and sh)
$ export VARIABLE=value
(for tcsh and csh)
% setenv VARIABLE value
Your disk usage is limited on RCAC systems. However, each filesystem (scratch, home directory, etc.) may have a different limit. If you exceed the soft limit or quota, you will see warnings whenever writing to the disk that you are over quota, but the write will still succeed. If you exceed the hard limit or limit, your write will fail until you either remove other files or your quota is increased. Generally, RCAC systems do not impose a soft limit—only a hard limit.
You may find out what your current quota is by using the quota command:
$ quota
Disk quotas for user myusername (uid 12345):
Filesystem blocks quota limit grace files quota limit grace
ba01:/u103 2346272 0 5000000 17508 0 65535
The columns are as follows:
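Assuming the standard Linux quota layout (filesystem, blocks used, soft quota, hard limit, grace period, then the same four columns for the number of files), the sample line above can be turned into a usage summary with awk. This is only a sketch (for bash and sh): it assumes both grace columns are empty, as they are in the sample, so the remaining fields are numbered 1 through 7.

```shell
# Sample line from the quota output above; on a real system, pipe the
# output of `quota` in instead.
sample='ba01:/u103 2346272 0 5000000 17508 0 65535'

# Assumed column order when the grace columns are empty:
# filesystem, blocks used, soft quota, hard limit, files used, soft, hard.
summary=$(printf '%s\n' "$sample" | awk '{
    printf "%s: %.1f%% of block limit, %.1f%% of file limit",
           $1, 100 * $2 / $4, 100 * $5 / $7
}')
echo "$summary"
```

For the sample line this prints "ba01:/u103: 46.9% of block limit, 26.7% of file limit".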
You may also see the disk usage of any given directory by using du:
$ du -hs
1.1G .
$ du -hs $HOME
138M /home/ba01/u103/myusername
This can be very helpful in figuring out where your largest files or directories are, so that you may clean out unneeded large files and avoid hitting your quota.
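To rank the entries of a directory by size, du can be combined with sort, as in the following sketch (for bash and sh). The demo tree is hypothetical and is created in a temporary directory; on a real system you would point top_dir at your home or scratch directory instead.

```shell
# Build a small demo tree; on a real system, set top_dir=$HOME instead.
top_dir=$(mktemp -d "${TMPDIR:-/tmp}/du_demo.XXXXXX")
mkdir "$top_dir/big" "$top_dir/small"
dd if=/dev/zero of="$top_dir/big/data.bin" bs=1024 count=512 2>/dev/null
echo "tiny" > "$top_dir/small/note.txt"

# Size of each entry in kilobytes, largest first, top five only.
largest=$(du -sk "$top_dir"/* | sort -rn | head -n 5)
echo "$largest"
```

The -k option reports sizes in kilobytes so that sort -rn can order them numerically, which the human-readable -h output does not allow.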
If you find you need additional disk space on an RCAC account, please first consider archiving and compressing old files and moving them to long-term storage. If this option does not resolve the issue, you may send an email to rcac-help@purdue.edu and request additional space.
There are several options for archiving and compressing groups of files or directories on RCAC systems. All of the following tools are provided:
zip:
(compress file somefile.c)
$ zip somefile.zip somefile.c
(extract contents of somefile.zip)
$ unzip somefile.zip
(compress all files in a directory into one archive file)
$ zip -r somefile.zip somedirectory/
(compress all ".c" files in current directory into one archive file)
$ zip -r somefile.zip . -i \*.c

tar:
(archive file somefile.c)
$ tar cvf somefile.tar somefile.c
(archive and compress file somefile.c)
$ tar czvf somefile.tar.gz somefile.c
(list contents of archive somefile.tar)
$ tar tvf somefile.tar
(extract contents of somefile.tar)
$ tar xvf somefile.tar
(extract contents of gzipped archive somefile.tar.gz)
$ tar xzvf somefile.tar.gz
(archive and compress all files in a directory into one archive file)
$ tar czvf somefile.tar.gz somedirectory/
(archive and compress all ".c" files in current directory into one archive file)
$ tar czvf somefile.tar.gz *.c

gzip:
(compress file somefile - also removes uncompressed file)
$ gzip somefile
(uncompress file somefile.gz - also removes compressed file)
$ gunzip somefile.gz

bzip2:
(compress file somefile - also removes uncompressed file)
$ bzip2 somefile
(uncompress file somefile.bz2 - also removes compressed file)
$ bunzip2 somefile.bz2
Windows users can work with these same formats using some of the following software:
There are a variety of ways to transfer data to and from RCAC systems. Which you should use depends on several factors, including the ease of use for you personally, connection speed and bandwidth, the size and number of files to be transferred. For more details on file transfer methods and applications, refer to the Gray Complete User Guide.
Application software is not provided on Gray. Gray is intended only for compiling software to be run on Black. GNU compilers are automatically included in your path, and IBM Fortran compilers are available under "/opt/ibmcmp/xlf/10.1/bin/".
The "module" command used on most RCAC systems is not available on Gray. Any software beyond the GNU compilers already in your path is probably installed under "/opt/".
Compilers are available on Gray for Fortran 77, Fortran 90, Fortran 95, C, and C++, but primarily only for serial programs. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution. More detailed documentation on each compiler set available on Gray follows.
Compilation of MPI programs on Gray is not possible. If you need to compile MPI programs to run on Black, you will need to compile those on Black.
Here is some more documentation from Indiana University on compilation on Gray:
To use the IBM compiler set on Gray, you do not need to load any modules. However, the compiler programs will not already be in your path. You must first add them to your path manually:
(for bash and sh)
$ export PATH=/opt/ibmcmp/xlf/10.1/bin:$PATH
(for tcsh and csh)
% setenv PATH /opt/ibmcmp/xlf/10.1/bin:$PATH
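If the path setting goes into a shell startup file, the directory can end up listed several times. The following sketch (for bash and sh) adds it only when it is missing; the function name prepend_path is hypothetical, and the directory is the one given in this guide.

```shell
# Hypothetical helper: put a directory at the front of PATH only if it is
# not already listed, so re-sourcing a startup file does not duplicate it.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                      # already present, do nothing
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# IBM Fortran compiler directory given in this guide.
prepend_path /opt/ibmcmp/xlf/10.1/bin
prepend_path /opt/ibmcmp/xlf/10.1/bin     # second call is a no-op
```

Wrapping PATH in colons before matching lets the same pattern catch the directory whether it sits at the start, middle, or end of the list.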
Here are some examples of how to use the IBM compilers on Gray:
| | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran77 | $ xlf_r myprogram.f -o myprogram | (not available) | $ xlf_r -qsmp=omp myprogram.f -o myprogram |
| Fortran90 | $ xlf90_r myprogram.f -o myprogram | (not available) | $ xlf90_r -qsmp=omp myprogram.f -o myprogram |
| Fortran95 | $ xlf95_r myprogram.f -o myprogram | (not available) | $ xlf95_r -qsmp=omp myprogram.f -o myprogram |
| C | (not available) | (not available) | (not available) |
| C++ | (not available) | (not available) | (not available) |
More information on compiler options can be found in the official man pages, which can be accessed using the "man" command, or online here:
Here is some more documentation from other sources on the IBM compilers:
To use the GNU compiler set on Gray, you need load no modules. The compiler programs will already be in your path. Here are some examples:
| | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran77 | $ gfortran myprogram.f -o myprogram | (not available) | (not available) |
| Fortran90 | $ gfortran myprogram.f90 -o myprogram | (not available) | (not available) |
| Fortran95 | $ gfortran myprogram.f95 -o myprogram | (not available) | (not available) |
| C | $ gcc myprogram.c -o myprogram | (not available) | (not available) |
| C++ | $ g++ myprogram.cpp -o myprogram | (not available) | (not available) |
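For build scripts that handle several source types, the mapping in the table above could be captured in a small helper, sketched here for bash and sh. The function name compiler_for is hypothetical; the sketch only prints the compile commands rather than running them.

```shell
# Hypothetical helper: map a source file to the GNU compiler listed in the
# table above, based on the file extension.
compiler_for() {
    case "$1" in
        *.f|*.f90|*.f95) echo gfortran ;;
        *.c)             echo gcc ;;
        *.cpp)           echo g++ ;;
        *) echo "unknown source type: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the compile command for each example source file.
for src in myprogram.f myprogram.f90 myprogram.c myprogram.cpp; do
    echo "$(compiler_for "$src") $src -o myprogram"
done
```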
More information on compiler options can be found in the official man pages, which can be accessed using the "man" command, or online here:
Only the smallest serial/OpenMP jobs can be run on Gray, as there is no job scheduling system. It is solely meant as a compute-host for serial jobs under Condor on Black/BigRed.
Condor allows users to run jobs on systems which would otherwise be idle, for as long as those systems are not needed by their primary users. Condor is one of several distributed computing systems RCAC makes available. Most RCAC resources, in addition to being available through normal means, are a part of BoilerGrid and can be used via Condor. If a primary user needs a machine, the Condor job is immediately checkpointed and/or migrated and the resource is made available. Thus, shorter jobs will have a better completion rate via Condor than longer jobs. Even though jobs may have to be restarted elsewhere, BoilerGrid can offer a vast amount of computational resources to users. Not only are nearly all RCAC systems part of BoilerGrid, but so are large numbers of lab machines at the West Lafayette and other Purdue campuses. BoilerGrid is one of the largest Condor pools in the world. Some machines at other institutions are also a part of a larger Condor federation known as DiaGrid and can be used as well. For more information, refer to the BoilerGrid documentation.
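As a rough sketch of what a serial Condor job looks like (for bash and sh), the commands below write a minimal submit description file. The executable and file names are hypothetical, the choice of the "vanilla" universe is an assumption, and this guide does not say which settings Black or BoilerGrid actually require; refer to the BoilerGrid documentation for those.

```shell
# Write a minimal submit description file into a demo directory.
# The executable and file names are hypothetical.
job_dir=$(mktemp -d "${TMPDIR:-/tmp}/condor_demo.XXXXXX")
cat > "$job_dir/myprogram.sub" <<'EOF'
# Assumption: a plain serial job in Condor's "vanilla" universe.
universe   = vanilla
executable = myprogram
output     = myprogram.out
error      = myprogram.err
log        = myprogram.log
queue
EOF

# On a Condor submit host the job would then be submitted with:
#   condor_submit myprogram.sub
echo "wrote $job_dir/myprogram.sub"
```

Condor does not checkpoint jobs in the vanilla universe, so an evicted job starts over from the beginning, which is one more reason to keep individual jobs short.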