This document follows certain typesetting and naming conventions:
$ example This is an example of commands and output.
Radon is a compute cluster operated by ITaP for general campus use. Radon consists of 24 64-bit, 8-core Dell 1950 systems with 16 GB RAM, 160 GB of disk, and 1 Gigabit Ethernet (1GigE) local to each node.
Radon consists of one logical sub-cluster "D". Each node has two 2.33 GHz quad-core Intel E5410 CPUs, 16 GB RAM, and 1 Gigabit Ethernet.
| Sub-Cluster | Number of Nodes | Processors per Node | Cores per Node | Memory per Node | Interconnect | Disk | Theoretical Peak TeraFLOPS |
|---|---|---|---|---|---|---|---|
| Radon-D | 30 | Two 2.33 GHz Quad-Core Intel E5410 | 8 | 16 GB | 1 GigE | 160 GB | 58.2 |
Radon nodes run Red Hat Enterprise Linux 5 (RHEL5) and use Moab Workload Manager 6 and TORQUE Resource Manager 3 as the portable batch system (PBS) for resource and job management. Radon also runs jobs for BoilerGrid whenever processor cores in it would otherwise be idle. The application of operating system patches occurs as security needs dictate. All nodes allow for unlimited stack usage, as well as unlimited core dump size (though disk space and server quotas may still be a limiting factor).
For more information about the TORQUE Resource Manager:
On Radon, ITaP recommends the following set of compiler, math library, and message-passing library for parallel code:
To load the recommended set:
$ module load devel
To verify what you loaded:
$ module list
The system interconnect is the networking technology that connects nodes of a cluster to each other. This is often much faster and sometimes radically different from the networking available between a resource and other machines or the outside world. Interconnects have different characteristics that may affect parallel message-passing programs and their design. Each ITaP research resource has different interconnect options available, and some have more than one available to all or only portions of the resource's nodes. For information on which interconnects are available, refer to the hardware specification for the resource above. Details about the specific interconnects available on Radon follow.
One Gigabit Ethernet (1GigE) is a form of Ethernet, currently the most widely used network link technology, that is able to transfer data at rates of approximately one Gigabit per second—ten times faster than 100 Mbps Ethernet. Consequently, 1GigE cable runs must be much shorter as well.
All Purdue faculty and staff, as well as students with the approval of their advisor, may request access to Radon. Refer to the Accounts / Access page for more details on how to request access.
To submit jobs on Radon, log in to the submission host radon.rcac.purdue.edu via SSH. This submission host is actually two front-end hosts: radon-fe00 and radon-fe01. The login process randomly assigns one of these two front-ends to each login to radon.rcac.purdue.edu. While the two front-end hosts are identical, each has its own /tmp. Sharing data in /tmp during subsequent sessions may fail. ITaP advises using scratch storage for multisession, shared data instead.
Secure Shell or SSH is a way of establishing a secure (encrypted) connection between two computers. It uses public-key cryptography to authenticate the remote computer and (optionally) to allow the remote computer to authenticate the user. Its usual function involves logging in to a remote machine and executing commands, but it also supports tunneling and forwarding of X11 or arbitrary TCP connections. There are many SSH clients available for all operating systems.
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
SSH works with many different means of authentication. One popular authentication method is Public Key Authentication (PKA). PKA is a method of establishing your identity to a remote computer using related sets of encryption data called keys. PKA is a more secure alternative to traditional password-based authentication with which you are probably familiar.
To employ PKA via SSH, you manually generate a keypair (also called SSH keys) in the location from where you wish to initiate a connection to a remote machine. This keypair consists of two text files: private key and public key. You keep the private key file confidential on your local machine or local home directory (hence the name "private" key). You then log in to a remote machine (if possible) and append the corresponding public key text to the end of a specific file, or have a system administrator do so on your behalf. In future login attempts, PKA compares the public and private keys to verify your identity; only then do you have access to the remote machine.
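As a hedged illustration of the keypair workflow described above, the sketch below uses OpenSSH's `ssh-keygen` to create a keypair and then appends the public key to an `authorized_keys` file. To stay self-contained it works entirely in a temporary directory: the key name `id_rsa_demo` and the local `authorized_keys` stand-in are hypothetical, and the passphrase is supplied with `-N` only so that the sketch can run unattended. In real use you would omit `-N` so that `ssh-keygen` prompts for the passphrase, and the public key would be appended to `~/.ssh/authorized_keys` on the remote machine.

```shell
# Work in a throwaway directory so no real keys are touched.
workdir=$(mktemp -d)
keyfile="$workdir/id_rsa_demo"            # hypothetical private key name
authorized="$workdir/authorized_keys"     # stand-in for the remote ~/.ssh/authorized_keys

if command -v ssh-keygen >/dev/null 2>&1; then
    # Creates $keyfile (private key) and $keyfile.pub (public key).
    # -N is for this unattended demo only; omit it to be prompted instead.
    ssh-keygen -q -t rsa -b 2048 -N "demo passphrase only" -f "$keyfile"

    # "Installing" a public key means appending it to authorized_keys.
    cat "$keyfile.pub" >> "$authorized"
    chmod 600 "$keyfile" "$authorized"
    keygen_status="created"
else
    keygen_status="unavailable"           # ssh-keygen is not installed here
fi
echo "keypair: $keygen_status"
```

The private key file never leaves the machine where it was generated; only the contents of the `.pub` file are shared.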
As a user, you can create, maintain, and employ as many keypairs as you wish. If you connect to a computational resource from your work laptop, your work desktop, and your home desktop, you can create and employ keypairs on each. You can also create multiple keypairs on a single local machine to serve different purposes, such as establishing access to different remote machines or establishing different types of access to a single remote machine. In short, PKA via SSH offers a secure but flexible means of identifying yourself to all kinds of computational resources.
Creating a keypair prompts you to provide a passphrase for the private key. This passphrase is different from a password in a number of ways. First, a passphrase is, as the name implies, a phrase. It can include most types of characters, including spaces, and has no limit on length. Second, the remote machine does not receive this passphrase for verification. Its only purpose is to unlock your local private key, and each passphrase applies to one specific private key.
Perhaps you are wondering why you would need a private key passphrase at all when using PKA. If the private key remains secure, why the need for a passphrase just to use it? Indeed, if the location of your private keys were always completely secure, a passphrase might not be necessary. In reality, a number of situations could arise in which someone may improperly gain access to your private key files. In these situations, a passphrase offers another level of security for you, the user who created the keypair.
Think of the private key/passphrase combination as being analogous to your ATM card/PIN combination. The ATM card itself is the object that grants access to your important accounts, and as such, should remain secure at all times—just as a private key should. But if you ever lose your wallet or someone steals your ATM card, you are glad that your PIN exists to offer another level of protection. The same is true for a private key passphrase.
When you create a keypair, you should always provide a corresponding private key passphrase. For security purposes, avoid using phrases which automated programs can discover (e.g. phrases that consist solely of words in English-language dictionaries). This passphrase is not recoverable if forgotten, so make note of it. Only a few situations warrant using a non-passphrase-protected private key—conducting automated file backups is one such situation. If you need to use a non-passphrase-protected private key to conduct automated backups to Fortress, see the No-Passphrase SSH Keys section.
SSH supports tunneling of X11 (X-Windows). If you have an X11 server running on your local machine, you may use X11 applications on remote systems and have their graphical displays appear on your local machine. These X11 connections are tunneled and encrypted automatically by your SSH client.
To use X11, you will need to have a local X11 server running on your personal machine. Both free and commercial X11 servers are available for various operating systems.
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
Once you are running an X11 server, you will need to enable X11 forwarding/tunneling in your SSH client:
SSH will set the remote environment variable $DISPLAY to "localhost:XX.YY" when this is working correctly. If you had previously set your $DISPLAY environment variable to your local IP or hostname, you must remove any set/export/setenv of this variable from your login scripts. The environment variable $DISPLAY must be left as SSH sets it, which is to a random local port address. Setting $DISPLAY to an IP or hostname will not work.
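The check below is a small sketch of how a login script might confirm that `$DISPLAY` still has the form that SSH assigns. The function name `display_set_by_ssh` is hypothetical, and the pattern assumes the `localhost:XX.YY` form described above; treat it as a heuristic rather than an authoritative test.

```shell
# Return success when the given value looks like an SSH-forwarded display,
# i.e. "localhost:" followed by a display number (for example localhost:10.0).
display_set_by_ssh() {
    case "$1" in
        localhost:[0-9]*) return 0 ;;
        *)                return 1 ;;
    esac
}

if display_set_by_ssh "${DISPLAY:-}"; then
    echo "DISPLAY=$DISPLAY looks like an SSH-forwarded display"
else
    echo "DISPLAY is unset or was overridden: ${DISPLAY:-<unset>}"
fi
```

If the second message appears during an SSH session with X11 forwarding enabled, look for a `set`, `export`, or `setenv` of `DISPLAY` in your login scripts.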
If you have received a default password as part of the process of obtaining your account, you should change it immediately when you log on for the first time. Change your password from any terminal/SSH session with the command passwd. You will have the same password on all ITaP systems. If you change your password on any one ITaP system, it will change on all ITaP systems.
If you already have a Purdue career account, then you will initially receive the same username and password as your career account. There is no need to change your career account password simply because you have received an account on ITaP systems.
There is currently no requirement regarding how often you must change your password for ITaP research systems. For security reasons, however, changing your password every six months (preferably every three) is good practice, and other systems on campus linked to your career account do require this.
A password should employ all of the following features:
Never share your password with another user or make your password known to anyone else. Systems staff will NEVER ask for your password, by email or otherwise.
There is no local email delivery available on Radon. Radon forwards all email which it receives to mail.rcac.purdue.edu for delivery.
Your shell is the program that generates your command-line prompt and processes commands. On ITaP research systems, several common shell choices are available:
| Name | Description | Path |
|---|---|---|
| bash | A Bourne-shell (sh) compatible shell with many newer advanced features as well. Bash is one of the most common shells in use today. | /bin/bash |
| tcsh | An advanced variant on csh with all the features of modern shells. Tcsh is probably the second most popular shell in use today. | /bin/tcsh |
| zsh | An advanced shell which incorporates all the functionality of bash, tcsh, and ksh combined, usually with identical syntax. In spite of this, zsh is not in common use. | /bin/zsh |
| csh | The original C-style shell. Because tcsh offers all the functionality of csh and more, use csh only when you have specific csh-only scripts. | /bin/csh |
| ksh | Korn shell, which was an early Bourne-shell compatible shell with some additional features. Unless you are already an adept ksh user, you would probably prefer bash. | /bin/ksh |
To find out what shell you are running right now, simply use the ps command:
$ ps
  PID TTY          TIME CMD
30181 pts/27   00:00:00 bash
30273 pts/27   00:00:00 ps
To use a different shell on a one-time or trial basis, simply type the shell name as a command. To return to your original shell, type exit:
$ ps
  PID TTY          TIME CMD
30181 pts/27   00:00:00 bash
30273 pts/27   00:00:00 ps
$ tcsh
% ps
  PID TTY          TIME CMD
30181 pts/27   00:00:00 bash
30313 pts/27   00:00:00 tcsh
30315 pts/27   00:00:00 ps
% exit
$
To permanently change your default login shell, use the command chsh:
$ chsh
Changing login shell for myusername on *all* ACMAINT hosts.
Enter existing password: **********
Old shell: nologin
New shell [nologin]: /bin/tcsh
Changed 'loginShell' to '/bin/tcsh' for login 'myusername' on host(s) 'host123.rcac.purdue.edu host234.rcac.purdue.edu ...'.
Connection to data.rcac.purdue.edu closed.
There is a propagation delay which may last up to two hours. After the change has taken effect, your next login will start in your new shell. Moreover, you may change your shell again at any time by rerunning chsh.
File storage options on ITaP research systems include long-term storage (home directories, Fortress) and short-term storage (scratch directories, /tmp directory). Each option has different performance and intended uses, and some options vary from system to system as well. ITaP provides daily snapshots of home directories for a limited time for accidental deletion recovery. ITaP does not back up short-term storage and regularly purges old files from scratch and /tmp directories. More details about each storage option appear below.
ITaP provides home directories for long-term file storage. Each user ID has one home directory. You should use your home directory for storing important program files, scripts, input data sets, critical results, and frequently used files. You should store infrequently used files on Fortress. Your home directory becomes your current working directory, by default, when you log in.
ITaP provides daily snapshots of your home directory for a limited period of time in the event of accidental deletion. For additional security, you should store another copy of your files on more permanent storage, such as the Fortress HPSS Archive.
Your home directory physically resides within the Isilon storage system at Purdue. To find the path to your home directory, first log in then immediately enter the following:
$ pwd
/home/myusername
Or from any subdirectory:
$ echo $HOME
/home/myusername
Your home directory and its contents are available on all ITaP research front-end hosts and compute nodes via the Network File System (NFS).
Your home directory has a quota capping the size and/or number of files you may store within. For more information, refer to the Storage Quotas / Limits Section.
Only files which have been snap-shotted overnight are recoverable. If you lose a file the same day you created it, it is NOT recoverable.
To recover files lost from your home directory, use the flost command:
$ flost
ITaP provides scratch directories for short-term file storage only. Each file system domain has at least one scratch directory. Each user ID may access one scratch directory in a file system domain. The quota of your scratch directory is several times greater than the quota of your home directory. You should use your scratch directory for storing large temporary input files which your job reads or for writing large temporary output files which you may examine after execution of your job. You should use your home directory and Fortress for longer-term storage or for holding critical results.
Files in scratch directories are not recoverable. ITaP does not back up files in scratch directories. If you accidentally delete a file, a disk crashes, or old files are purged, the files cannot be restored.
ITaP automatically removes (purges) from scratch directories all files stored for more than 90 days. Owners of these files receive a notice one week before removal via email. For more information, please refer to our Scratch File Purging Policy.
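To see which files are approaching the purge window, something like the following `find` invocation may help. This is a sketch under assumptions: it uses a file's modification time (`-mtime`) as a proxy for how long the file has been stored, which may not match the criterion the purge policy actually applies, and the 83-day threshold (90 days minus the one-week notice) is only an illustrative choice. It runs against a temporary directory containing one artificially aged file so that it is self-contained; on Radon you would point it at your scratch directory instead.

```shell
# Build a small stand-in for a scratch directory.
demo_scratch=$(mktemp -d)
touch "$demo_scratch/recent.dat"
touch -d "100 days ago" "$demo_scratch/old.dat"   # GNU touch: backdate the modification time

# List regular files last modified more than 83 days ago.
at_risk=$(find "$demo_scratch" -type f -mtime +83)
echo "$at_risk"
```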
All users may access scratch directories on Radon. To find the path to your scratch directory:
$ findscratch
/scratch/scratch95/m/myusername
The value of variable $RCAC_SCRATCH is your scratch directory path. Use this variable in any scripts. Your actual scratch directory path may change without warning, but this variable will remain current.
$ echo $RCAC_SCRATCH
/scratch/scratch95/m/myusername
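A script that uses `$RCAC_SCRATCH` rather than a hard-coded path might look like the sketch below. The `job_dir` name and the fallback to a temporary directory when `$RCAC_SCRATCH` is unset (for example, when testing the script off-cluster) are illustrative assumptions, not ITaP conventions.

```shell
# Prefer the scratch path from the environment; never hard-code it.
scratch_root="${RCAC_SCRATCH:-$(mktemp -d)}"

# Hypothetical per-job working directory under scratch.
job_dir="$scratch_root/job_dir"
mkdir -p "$job_dir"

# Stand-in for a large temporary output file written by a job.
echo "large temporary output" > "$job_dir/output.tmp"
ls -l "$job_dir"
```

Because the script reads the path from the variable each time it runs, it keeps working if the underlying scratch path changes.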
Your scratch directory on Radon may be the same location as, and shared with, your scratch directory on some other ITaP research resources, while remaining distinct from your scratch directory on others. All front-end/login nodes on all computational resources are able to access the scratch directories of all other computational resources. However, compute nodes are only able to access the scratch directory allocated to that specific computational resource. ITaP may change which computational resources share scratch storage with which other computational resources as needs dictate. For more information about which computational resources share scratch volumes, please see the section Network Storage.
To find the path to someone else's scratch directory:
$ findscratch someusername
/scratch/scratch95/s/someusername
Your scratch directory has a quota capping the size and number of files you may store in it. For more information, refer to the section Storage Quotas / Limits.
ITaP provides /tmp directories for short-term file storage only. Each front-end and compute node has a /tmp directory. Your program may write temporary data to the /tmp directory of the compute node on which it is running. That data is available for as long as your program is active. Once your program terminates, that temporary data is no longer available. When used properly, /tmp may provide faster local storage to an active process than any other storage option. You should use your home directory and Fortress for longer-term storage or for holding critical results.
ITaP does not perform backups for the /tmp directory and removes files from /tmp whenever space is low or whenever the system needs a reboot. In the event of a disk crash or file purge, files in /tmp are not recoverable. You should copy any important files to more permanent storage.
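One way to follow this pattern in a job script is sketched below: create a private working directory under `/tmp`, do the work there, copy what matters to more permanent storage, and clean up. The function name `run_with_local_tmp`, the file names, and the destination directory are hypothetical, and the `sort` call merely stands in for real work.

```shell
run_with_local_tmp() {
    dest_dir=$1
    # A subshell scopes the EXIT trap, so cleanup runs as soon as the work ends.
    (
        workdir=$(mktemp -d "${TMPDIR:-/tmp}/myjob.XXXXXX") || exit 1
        trap 'rm -rf "$workdir"' EXIT

        # Stand-in for real work that writes temporary data to local disk.
        printf '3\n1\n2\n' > "$workdir/unsorted.txt"
        sort -n "$workdir/unsorted.txt" > "$workdir/result.txt"

        # Copy important results to more permanent storage before exiting.
        cp "$workdir/result.txt" "$dest_dir/"
    )
}

results_dir=$(mktemp -d)     # stand-in for a directory under $HOME
run_with_local_tmp "$results_dir"
cat "$results_dir/result.txt"
```

Using `mktemp -d` gives each run its own directory, which avoids collisions with other jobs or users writing to the same node's `/tmp`.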
Long-term Storage or Permanent Storage is available to ITaP research users on the High Performance Storage System (HPSS), an archival storage system, commonly referred to as "Fortress". HPSS is a software package that manages a hierarchical storage system. Program files, data files and any other files which are not used often, but which must be saved, can be put in permanent storage. Fortress currently has a 1.2 PB capacity.
Files smaller than 100 MB have their primary copy stored on low-cost disks (disk cache), but the second copy (backup of disk cache) is on tape or optical disks. This provides a rapid restore time to the disk cache. However, the large latency to access a larger file (usually involving a copy from a tape cartridge) makes it unsuitable for direct use by any processes or jobs, even where possible. The primary and secondary copies of larger files are stored on separate tape cartridges in the Quantum (ADIC, Advanced Digital Information Corporation) tape library.
To ensure optimal performance for all users, and to keep the Fortress system healthy, please remember the following tips:
Fortress writes two copies of every file either to two tapes, or to disk and a tape, to protect against medium errors. Unfortunately, Fortress does not automatically switch to the alternate copy when it has trouble accessing the primary. If it seems to be taking an extraordinary amount of time to retrieve a file (hours), please either email rcac-help@purdue.edu or call ITaP Customer Service at 765-494-4000. We can then investigate why it is taking so long. If it is an error on the primary copy, we will instruct Fortress to switch to the alternate copy as the primary and recreate a new alternate copy.
For more information about Fortress, how it works, user guides, and how to obtain an account:
There are a variety of ways to manually transfer files to your Fortress home directory for long-term storage.
HSI, the Hierarchical Storage Interface, is the preferred method of transferring files to and from Fortress. HSI is designed to be a friendly interface for users of the High Performance Storage System (HPSS). It provides a familiar Unix-style environment for working within HPSS while automatically taking advantage of high-speed, parallel file transfers without requiring any special user knowledge.
HSI is already provided on all ITaP research systems as the command hsi. You may download HSI for the following platforms as well:
Any machines using HSI or HTAR must have all firewalls (local and departmental) configured to allow open access from the following IP addresses:
If you are unsure of how to modify your firewall settings, please consult with your department's IT support or the documentation for your operating system. Access to Fortress is restricted to on-campus networks. If you need to directly access Fortress from off-campus, please use the Purdue VPN service before connecting.
Interactive usage:
$ hsi
*************************************************************************
* Purdue University
* High Performance Storage System (HPSS)
*************************************************************************
* This is the Purdue Data Archive, Fortress. For further information
* see http://www.rcac.purdue.edu/userinfo/resources/fortress/
*
* If you are having problems with HPSS, please call IT/Operational
* Services at 49-44000 or send E-mail to dxul-help@purdue.edu.
*
*************************************************************************
Username: myusername UID: 12345 Acct: 12345(12345) Copies: 1 Firewall: off [hsi.3.5.8 Wed Sep 21 17:31:14 EDT 2011]
[Fortress HSI]/home/myusername->put data1.fits
put 'data1.fits' : '/home/myusername/data1.fits' ( 1024000000 bytes, 250138.1 KBS (cos=11))
[Fortress HSI]/home/myusername->lcd /tmp
[Fortress HSI]/home/myusername->get data1.fits
get '/tmp/data1.fits' : '/home/myusername/data1.fits' (2011/10/04 16:28:50 1024000000 bytes, 325844.9 KBS )
[Fortress HSI]/home/myusername->quit
Batch transfer file:
put data1.fits
put data2.fits
put data3.fits
put data4.fits
put data5.fits
put data6.fits
put data7.fits
put data8.fits
put data9.fits
Batch usage:
$ hsi < my_batch_transfer_file
*************************************************************************
* Purdue University
* High Performance Storage System (HPSS)
*************************************************************************
* This is the Purdue Data Archive, Fortress. For further information
* see http://www.rcac.purdue.edu/userinfo/resources/fortress/
*
* If you are having problems with HPSS, please call IT/Operational
* Services at 49-44000 or send E-mail to dxul-help@purdue.edu.
*
*************************************************************************
Username: myusername UID: 12345 Acct: 12345(12345) Copies: 1 Firewall: off [hsi.3.5.8 Wed Sep 21 17:31:14 EDT 2011]
put 'data1.fits' : '/home/myusername/data1.fits' ( 1024000000 bytes, 250200.7 KBS (cos=11))
put 'data2.fits' : '/home/myusername/data2.fits' ( 1024000000 bytes, 258893.4 KBS (cos=11))
put 'data3.fits' : '/home/myusername/data3.fits' ( 1024000000 bytes, 222819.7 KBS (cos=11))
put 'data4.fits' : '/home/myusername/data4.fits' ( 1024000000 bytes, 224311.9 KBS (cos=11))
put 'data5.fits' : '/home/myusername/data5.fits' ( 1024000000 bytes, 323707.3 KBS (cos=11))
put 'data6.fits' : '/home/myusername/data6.fits' ( 1024000000 bytes, 320322.9 KBS (cos=11))
put 'data7.fits' : '/home/myusername/data7.fits' ( 1024000000 bytes, 253192.6 KBS (cos=11))
put 'data8.fits' : '/home/myusername/data8.fits' ( 1024000000 bytes, 253056.2 KBS (cos=11))
put 'data9.fits' : '/home/myusername/data9.fits' ( 1024000000 bytes, 323218.9 KBS (cos=11))
EOF detected on TTY - ending HSI session
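A batch transfer file like the one above can be generated rather than typed by hand. The sketch below writes one `put` line per matching file; it only builds the file and does not contact Fortress. The empty `data*.fits` files are created here purely so that the example is self-contained.

```shell
# Work in a temporary directory with nine empty stand-in data files.
cd "$(mktemp -d)" || exit 1
for i in 1 2 3 4 5 6 7 8 9; do
    : > "data$i.fits"
done

# Emit one HSI "put" command per file.
for f in data*.fits; do
    echo "put $f"
done > my_batch_transfer_file

cat my_batch_transfer_file
```

The resulting file can then be passed to `hsi` on standard input, as in the batch usage shown above. File names containing spaces or quotes would need extra care that this sketch does not attempt.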
For more information about HSI:
HTAR (short for "HPSS TAR") is a utility program that writes TAR-compatible archive files directly onto Fortress, without having to first create a local file. Its command line was originally based on the AIX tar program, with a number of extensions added to provide extra features.
HTAR is already provided on all ITaP research systems as the command htar. You may download HTAR for the following platforms as well:
Any machines using HSI or HTAR must have all firewalls (local and departmental) configured to allow open access from the following IP addresses:
If you are unsure of how to modify your firewall settings, please consult with your department's IT support or the documentation for your operating system. Access to Fortress is restricted to on-campus networks. If you need to directly access Fortress from off-campus, please use the Purdue VPN service before connecting.
Usage:
(Create a tar archive on Fortress named data.tar including all files with the extension ".fits".)
$ htar -cvf data.tar *.fits
HTAR: a data1.fits
HTAR: a data2.fits
HTAR: a data3.fits
HTAR: a data4.fits
HTAR: a data5.fits
HTAR: a data6.fits
HTAR: a data7.fits
HTAR: a data8.fits
HTAR: a data9.fits
HTAR: a /tmp/HTAR_CF_CHK_17953_1317760775
HTAR Create complete for data.tar. 9,216,006,144 bytes written for 9 member files, max threads: 3 Transfer time: 29.622 seconds (311.121 MB/s)
HTAR: HTAR SUCCESSFUL

(Unpack a tar archive on Fortress named data.tar into a scratch directory for use in a batch job.)
$ cd $RCAC_SCRATCH/job_dir
$ htar -xvf data.tar
HTAR: x data1.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data2.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data3.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data4.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data5.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data6.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data7.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data8.fits, 1024000000 bytes, 2000001 media blocks
HTAR: x data9.fits, 1024000000 bytes, 2000001 media blocks
HTAR: Extract complete for data.tar, 9 files. total bytes read: 9,216,004,608 in 33.914 seconds (271.749 MB/s )
HTAR: HTAR SUCCESSFUL

(Look at the contents of the data.tar HTAR archive on Fortress.)
$ htar -tvf data.tar
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:30 data1.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data2.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data3.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data4.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data5.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data6.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data7.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data8.fits
HTAR: -rw-r--r-- myusername/pucc 1024000000 2011-10-04 16:35 data9.fits
HTAR: -rw------- myusername/pucc 256 2011-10-04 16:39 /tmp/HTAR_CF_CHK_17953_1317760775
HTAR: Listing complete for data.tar, 10 files 10 total objects
HTAR: HTAR SUCCESSFUL

(Unpack a single file, "data7.fits", from the tar archive on Fortress named data.tar into a scratch directory.)
$ htar -xvf data.tar data7.fits
HTAR: x data7.fits, 1024000000 bytes, 2000001 media blocks
HTAR: Extract complete for data.tar, 1 files. total bytes read: 1,024,000,512 in 3.642 seconds (281.166 MB/s )
HTAR: HTAR SUCCESSFUL
For more information about HTAR:
Fortress does NOT support SCP.
Fortress does NOT support SFTP.
If you are using an ITaP research cluster front-end system, your Fortress home directory is available as /archive/fortress/home/myusername. While your Fortress home directory can be accessed via NFS in this way, this is only provided as a convenience and should not be used on a regular basis as it is extremely slow. Instead, use the HSI command to get a fast, parallelized, UNIX-like interface to your Fortress home directory.
Many environment variables specify storage locations and paths. Your login automatically defines these variables for you. You may redefine them if necessary. In addition, you define many more environment variables when you load the modules of specific applications, such as compilers and MATLAB. (See the module command section for more information.)
Use environment variables instead of actual paths whenever possible to avoid problems if the specific paths to any of these change. Some of the environment variables you should have are:
| Name | Description |
|---|---|
| USER | your username |
| HOME | path to your home directory |
| PWD | path to your current directory |
| RCAC_SCRATCH | path to scratch filesystem |
| PATH | all directories searched for commands/applications |
| HOSTNAME | name of the machine you are on |
| SHELL | path to your login shell (e.g. bash, tcsh, csh, ksh) |
| SSH_CLIENT | your local client's IP address |
| TERM | type of terminal or terminal emulator being used |
| OMP_NUM_THREADS | number of OpenMP threads |
By convention, environment variable names are all uppercase. You may use them on the command line or in any scripts in place of and in combination with hard-coded values:
$ ls $HOME
...
$ ls $RCAC_SCRATCH/myproject
...
To find the value of any environment variable:
$ echo $RCAC_SCRATCH
/scratch/scratch95/m/myusername
$ echo $SHELL
/bin/tcsh
To list the values of all environment variables:
$ env
USER=myusername
HOME=/home/myusername
RCAC_SCRATCH=/scratch/scratch95/m/myusername
SHELL=/bin/tcsh
...
You may create or overwrite an environment variable. To pass (export) the value of a variable in either bash or ksh:
$ export VARIABLE=value
To assign a value to an environment variable in either tcsh or csh:
$ setenv VARIABLE value
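The difference between a plain shell variable and an exported one shows up in child processes. The following is a minimal illustration for Bourne-style shells such as bash, using a hypothetical variable name `MY_SETTING`:

```shell
# A plain assignment is visible only in the current shell.
MY_SETTING=value
before=$(sh -c 'echo "${MY_SETTING:-unset}"')

# After export, child processes inherit the variable.
export MY_SETTING
after=$(sh -c 'echo "${MY_SETTING:-unset}"')

echo "before export: $before"
echo "after export:  $after"
```

The child shell reports `unset` before the export and `value` afterwards, which is why settings that a program or batch job must see need to be exported and not merely assigned.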
ITaP imposes some limits on your disk usage on research systems. Each filesystem (home directory, scratch directory, etc.) may have a different limit. ITaP does not implement soft limits; only a hard limit applies, and any write that would exceed it fails. You can then either remove files you no longer need, move them to the Fortress HPSS Archive, or ask us about increasing your quota.
To discover the current quotas of your home and scratch directories:
$ myquota
Type        Filesystem          Size    Limit  Use         Files    Limit  Use
==============================================================================
home        extensible         5.0GB   10.0GB  50%             -        -    -
scratch     scratch95          128KB  238.4GB   0%             6  100,000   0%
other       lustreA              8KB  476.8GB   0%             2  100,000   0%
The columns are as follows:
If you find that you reached your quota in either your home directory or your scratch file directory, obtain estimates of your disk usage. Find the top-level directories which have a high disk usage, then study the subdirectories to discover where the heaviest usage lies.
To see in a human-readable format an estimate of the disk usage of your top-level directories in your home directory:
$ du -h --max-depth=1 $HOME
32K /home/myusername/mysubdirectory_1
529M /home/myusername/mysubdirectory_2
608K /home/myusername/mysubdirectory_3
The second directory is the largest of the three, so apply command du to it.
To see in a human-readable format an estimate of the disk usage of your top-level directories in your scratch file directory:
$ du -h --max-depth=1 $RCAC_SCRATCH
160K /scratch/scratch95/m/myusername
This strategy can be very helpful in figuring out the location of your largest usage. Move unneeded files and directories to alternate long-term storage to free space in your home and scratch directories.
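The same search can be partly automated by sorting the `du` output so that the largest directories come first. The sketch below builds a small throwaway directory tree (the names `small` and `big` are hypothetical) so that it can run anywhere; on Radon you would replace `$demo_dir` with `$HOME` or `$RCAC_SCRATCH`.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/small" "$demo_dir/big"
head -c 1024    /dev/urandom > "$demo_dir/small/a.dat"   # about 1 KB
head -c 2097152 /dev/urandom > "$demo_dir/big/b.dat"     # about 2 MB

# -k reports sizes in kilobytes so they sort numerically. The first line is
# the total for $demo_dir itself, so the second line is the largest subdirectory.
du -k --max-depth=1 "$demo_dir" | sort -rn | head -n 5
largest=$(du -k --max-depth=1 "$demo_dir" | sort -rn | sed -n '2p' | cut -f2)
echo "largest subdirectory: $largest"
```

Repeating the command on the directory it reports narrows the search one level at a time.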
If you find you need additional disk space in your home directory, please first consider archiving and compressing old files and moving them to long-term storage on the Fortress HPSS Archive. If you are unable to do so, you may go to the BoilerBackpack Quota Management site and use the sliders there to increase the amount of space allocated to your research home directory vs. other storage options, up to a maximum of 100GB.
There are several options for archiving and compressing groups of files or directories on ITaP research systems. ITaP provides the following tools:
(extract contents of somefile.zip) $ unzip somefile.zip (compress file somefile.c) $ zip somefile.zip somefile.c (compress all files in a directory into one archive file) $ zip -r somefile.zip somedirectory/ (compress all ".c" files in current directory into one archive file) $ zip -r somefile.zip . -i \*.c
(extract contents of somefile.7z) $ 7za e somefile.7z (compress file somefile.c) $ 7za a somefile.7z somefile.c (compress all files in a directory into one archive file) $ 7za a somefile.7z somedirectory/ (compress all ".c" files in current directory into one archive file) $ 7za a somefile.7z *.c
(list contents of archive somefile.tar) $ tar tvf somefile.tar (extract contents of somefile.tar) $ tar xvf somefile.tar (extract contents of gzipped archive somefile.tar.gz) $ tar xzvf somefile.tar.gz (extract contents of bzip2 archive somefile.tar.bz2) $ tar xjvf somefile.tar.bz2 (extract contents of xz archive somefile.tar.xz) $ tar xJvf somefile.tar.xz (archive file somefile.c) $ tar cvf somefile.tar somefile.c (archive all ".c" files in current directory into one archive file) $ tar cvf somefile.tar.gz *.c (archive all files in a directory into one archive file) $ tar cvf somefile.tar.gz somedirectory/ (archive and gzip-compress all files in a directory into one archive file) $ tar czvf somefile.tar.gz somedirectory/ (archive and bzip2-compress all files in a directory into one archive file) $ tar cjvf somefile.tar.bz2 somedirectory/ (archive and xz-compress all files in a directory into one archive file) $ tar cJvf somefile.tar.xz somedirectory/
(compress file somefile - also removes uncompressed file)
$ gzip somefile
(uncompress file somefile.gz - also removes compressed file)
$ gunzip somefile.gz
(compress file somefile - also removes uncompressed file)
$ bzip2 somefile
(uncompress file somefile.bz2 - also removes compressed file)
$ bunzip2 somefile.bz2
(compress file somefile - also removes uncompressed file)
$ xz somefile
(uncompress file somefile.xz - also removes compressed file)
$ unxz somefile.xz
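As a rough illustration of how these tools combine before moving data to long-term storage, the following sketch archives and gzip-compresses a directory, lists the archive, and extracts it elsewhere to confirm the round trip. The scratch directory from mktemp and the names roundtrip_ok and restored are assumptions made for this example; somedirectory and somefile.c follow the placeholder names used above.

```shell
#!/bin/sh
# Work in a throwaway directory so none of your real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small placeholder directory to archive.
mkdir somedirectory
echo 'int main(void) { return 0; }' > somedirectory/somefile.c

# Archive and gzip-compress the directory, then list the archive contents.
tar czf somefile.tar.gz somedirectory/
tar tzf somefile.tar.gz

# Extract into a separate directory and compare with the original file.
mkdir restored
tar xzf somefile.tar.gz -C restored
if cmp -s somedirectory/somefile.c restored/somedirectory/somefile.c; then
    roundtrip_ok=yes
else
    roundtrip_ok=no
fi
echo "round trip ok: $roundtrip_ok"

# Remove "$workdir" once you have finished inspecting the result.
```

Verifying an archive this way before deleting the originals is a cheap safeguard, since gzip, bzip2, and xz remove their input files as noted above.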
Windows users can work with these same formats using some of the following software:
There are a variety of ways to transfer data to and from ITaP research systems. Which you should use depends on several factors, including the ease of use for you personally, connection speed and bandwidth, and the size and number of files which you intend to transfer.
FTP (File Transfer Protocol) is a simple data transfer mechanism. FTP does not provide secure communications, so ITaP no longer supports FTP on any ITaP research systems. However, most modern FTP clients support either SFTP or SCP, which are similar, secure protocols for file transfer. Try using one of the other methods described here instead of FTP.
SCP (Secure CoPy) is a simple way of transferring files between two machines that use the SSH (Secure SHell) protocol. You may use SCP to connect to any system where you have SSH (login) access. SCP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SCP can copy single files and, with the -r option, will also recursively copy the contents of a directory.
Command-line usage:
(to a remote system from local)
$ scp sourcefilename myusername@hostname:somedirectory/destinationfilename
(from a remote system to local)
$ scp myusername@hostname:somedirectory/sourcefilename destinationfilename
(recursive directory copy to a remote system from local)
$ scp -r sourcedirectory/ myusername@hostname:somedirectory/
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
SFTP (Secure File Transfer Protocol) is a reliable way of transferring files between two machines. You may use SFTP to connect to most ITaP research systems. SFTP is available as a protocol choice in some graphical file transfer programs and also as a command-line program on most Linux, Unix, and Mac OS X systems. SFTP has more features than SCP and allows for other operations on remote files, remote directory listing, and resuming interrupted transfers. Command-line SFTP cannot recursively copy directory contents; to do so, try using SCP or a graphical SFTP client.
Command-line usage:
$ sftp -B buffersize myusername@hostname
(to a remote system from local)
sftp> put sourcefile somedir/destinationfile
sftp> put -P sourcefile somedir/
(from a remote system to local)
sftp> get sourcefile somedir/destinationfile
sftp> get -P sourcefile somedir/
sftp> exit
Linux / Solaris / AIX / HP-UX / Unix:
Microsoft Windows:
Mac OS X:
LFTP is a command-line file-transfer program for Linux and Unix systems. It supports SFTP, HTTP, and HTTPS file transfers. LFTP has additional features not provided by SFTP, such as bandwidth throttling, transfer queues, and parallel transfers. You can use it interactively or in scripts.
LFTP with parallel transfers can be much faster than SCP or SFTP, so ITaP encourages its use when possible.
LFTP is available only on some ITaP research systems. However, it is simply a client, so the remote machine involved in a transfer does not need it (the remote system need only support SFTP).
Interactive usage:
$ lftp myusername@hostname
(transfer all ".dat" files from remote system to local)
lftp :~> mget *.dat
(transfer "filename.dat" file from local system to remote)
lftp :~> put filename.dat
(transfer a directory and all contents from remote
system to local, using 5 connections in parallel)
lftp :~> mirror --parallel=5 remotedirectory localdirectory/
(transfer a directory and all contents from local
system to remote, using 8 connections in parallel)
lftp :~> mirror -R --parallel=8 localdirectory remotedirectory/
Batch usage:
(specify all actions on command line)
$ lftp myusername@hostname -e "mget *.dat"
(specify all actions in the script file "mytransfer.lftp")
$ lftp myusername@hostname -f mytransfer.lftp
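The script file passed with -f simply holds LFTP commands, one per line. As a sketch, the following writes a hypothetical mytransfer.lftp that reuses two of the commands from the interactive examples above; the directory names are the same placeholders, and the final lftp invocation is left commented out because it needs a real host and account.

```shell
#!/bin/sh
# Write the LFTP commands to a script file, one command per line.
# The quoted 'EOF' makes the shell copy the text literally, with no
# variable or command substitution.
cat > mytransfer.lftp <<'EOF'
mget *.dat
mirror --parallel=5 remotedirectory localdirectory/
EOF

# Show what will be handed to LFTP.
cat mytransfer.lftp

# Run the transfer (placeholder user and host, so left commented out):
# lftp myusername@hostname -f mytransfer.lftp
```

Keeping the commands in a file makes a recurring transfer repeatable and easy to review before it runs.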
GridFTP is a fast method of transferring large files that uses Globus authentication credentials (x509 certificates). GridFTP is available on some ITaP resources, but only to users who are members of a Grid project, such as TeraGrid, NorthWest Indiana Computational Grid (NWICG), or Open Science Grid (OSG). However, not all grids may access all ITaP resources.
For more information about how to use GridFTP, consult documentation for your participating grid.
SMB (Server Message Block), also known as CIFS, is an easy-to-use file transfer protocol that is useful for transferring files between ITaP research systems and a desktop or laptop. You may use SMB to connect to your home, scratch, and Fortress storage directories. The SMB protocol is available on Windows, Linux, and Mac OS X. It is primarily used as a graphical means of transfer, but it can also be used from the command line.
Windows:
Mac OS X:
Linux:
$ smbclient //samba.rcac.purdue.edu/myusername -U myusername -W onepurdue
The following table lists the third-party software which ITaP has installed on its research systems. Additional software may be available. To see the software on a specific system, run the command module avail on that system. Please contact rcac-help@purdue.edu if you are interested in the availability of software not shown in this list.
| Software | Radon | Steele | Coates, Rossmann, Hansen & Carter | Peregrine 1 |
|---|---|---|---|---|
| Abaqus ¹ | ✔ | ✔ | ✔ | ✔ |
| AcGrace | ✔ | ✔ | ✔ | ✔ |
| Amber ¹ | ✔ | ✔ | ✔ | ✘ |
| Ann | ✔ | ✔ | ✔ | ✔ |
| ANSYS ¹ | ✔ | ✔ | ✔ | ✔ |
| ATK | ✔ | ✔ | ✔ | ✔ |
| Antelope | ✘ | ✘ | ✔ | ✘ |
| Auto3Dem | ✔ | ✔ | ✔ | ✔ |
| ATLAS | ✔ | ✔ | ✔ | ✔ |
| BinUtils | ✔ | ✔ | ✔ | ✔ |
| BLAST | ✔ | ✔ | ✔ | ✔ |
| Boost | ✔ | ✔ | ✔ | ✔ |
| Cairo | ✔ | ✔ | ✔ | ✔ |
| CDAT | ✔ | ✔ | ✔ | ✔ |
| CGNSLib | ✔ | ✔ | ✔ | ✔ |
| Cmake | ✔ | ✔ | ✔ | ✔ |
| COMSOL ² | ✔ | ✔ | ✔ | ✔ |
| CPLEX ¹ | ✔ | ✔ | ✔ | ✔ |
| DX | ✔ | ✔ | ✔ | ✔ |
| Eman | ✔ | ✔ | ✔ | ✔ |
| Eman2 | ✔ | ✔ | ✔ | ✔ |
| Ferret | ✔ | ✔ | ✔ | ✔ |
| FFMPEG | ✔ | ✔ | ✔ | ✔ |
| FFTW | ✔ | ✔ | ✔ | ✔ |
| FLUENT ¹ | ✔ | ✔ | ✔ | ✔ |
| GAMESS | ✔ | ✔ | ✔ | ✔ |
| GAMS | ✔ | ✔ | ✔ | ✔ |
| Gaussian ¹ | ✔ | ✔ | ✔ | ✔ |
| GCC (Compilers) | ✔ | ✔ | ✔ | ✔ |
| GDAL | ✘ | ✔ | ✔ | ✘ |
| GemPak | ✔ | ✔ | ✔ | ✔ |
| Git | ✔ | ✔ | ✔ | ✔ |
| GLib | ✔ | ✔ | ✔ | ✔ |
| GMP | ✔ | ✔ | ✔ | ✔ |
| GMT | ✔ | ✔ | ✔ | ✔ |
| GrADS | ✔ | ✔ | ✔ | ✔ |
| GROMACS | ✔ | ✔ | ✔ | ✔ |
| GS | ✔ | ✔ | ✔ | ✔ |
| GSL | ✔ | ✔ | ✔ | ✔ |
| GTK+ | ✔ | ✔ | ✔ | ✔ |
| GTKGlarea | ✔ | ✔ | ✔ | ✔ |
| Guile | ✔ | ✔ | ✔ | ✔ |
| HarminV | ✔ | ✔ | ✔ | ✔ |
| HDF4 | ✔ | ✔ | ✔ | ✔ |
| HDF5 | ✔ | ✔ | ✔ | ✔ |
| Hy3S | ✔ | ✔ | ✔ | ✔ |
| ImageMagick | ✔ | ✔ | ✔ | ✔ |
| IMSL ¹ | ✔ | ✔ | ✔ | ✔ |
| Intel Compilers ¹ | ✔ | ✔ | ✔ | ✔ |
| Jackal ² | ✔ | ✔ | ✔ | ✔ |
| Jasper | ✔ | ✔ | ✔ | ✔ |
| Java | ✔ | ✔ | ✔ | ✔ |
| LAMMPS | ✔ | ✔ | ✔ | ✔ |
| LibCTL | ✔ | ✔ | ✔ | ✔ |
| LibPNG | ✔ | ✔ | ✔ | ✔ |
| LibTool | ✔ | ✔ | ✔ | ✔ |
| LoopyMod ² | ✔ | ✔ | ✔ | ✔ |
| Maple ¹ | ✔ | ✔ | ✔ | ✔ |
| Mathematica ¹ | ✔ | ✔ | ✔ | ✔ |
| MATLAB ¹ | ✔ | ✔ | ✔ | ✔ |
| Meep | ✔ | ✔ | ✔ | ✔ |
| MoPac | ✔ | ✔ | ✔ | ✔ |
| MPB | ✔ | ✔ | ✔ | ✔ |
| MPFR | ✔ | ✔ | ✔ | ✔ |
| MPICH | ✔ | ✔ | ✔ | ✔ |
| MPICH2 | ✔ | ✔ | ✔ | ✔ |
| MPIExec | ✔ | ✔ | ✔ | ✔ |
| MrBayes | ✔ | ✔ | ✔ | ✔ |
| MUMPS | ✔ | ✔ | ✔ | ✔ |
| MVAPICH2 | ✔ | ✔ | ✔ | ✔ |
| NAMD | ✔ | ✔ | ✔ | ✔ |
| NCL | ✔ | ✔ | ✔ | ✔ |
| NCO | ✔ | ✔ | ✔ | ✔ |
| NCView | ✔ | ✔ | ✔ | ✔ |
| NetCDF | ✔ | ✔ | ✔ | ✔ |
| NETPBM | ✔ | ✔ | ✔ | ✔ |
| NWChem | ✔ | ✔ | ✔ | ✔ |
| Octave | ✔ | ✔ | ✔ | ✔ |
| OpenMPI | ✔ | ✔ | ✔ | ✔ |
| Pango | ✔ | ✔ | ✔ | ✔ |
| Petsc | ✔ | ✔ | ✔ | ✔ |
| PGI Compilers ¹ | ✔ | ✔ | ✔ | ✔ |
| Phrap | ✔ | ✔ | ✔ | ✔ |
| Pixman | ✔ | ✔ | ✔ | ✔ |
| PKG-Config | ✔ | ✔ | ✔ | ✔ |
| Proj | ✔ | ✔ | ✔ | ✔ |
| Python | ✔ | ✔ | ✔ | ✔ |
| QTLC | ✔ | ✔ | ✔ | ✔ |
| Rational | ✔ | ✔ | ✔ | ✔ |
| R | ✔ | ✔ | ✔ | ✔ |
| SAC | ✔ | ✔ | ✔ | ✔ |
| SAS ¹ | ✔ | ✔ | ✔ | ✔ |
| ScaLAPACK | ✔ | ✔ | ✔ | ✔ |
| Seismic | ✔ | ✔ | ✔ | ✔ |
| Subversion | ✔ | ✔ | ✔ | ✔ |
| SWFTools | ✔ | ✔ | ✔ | ✔ |
| Swig | ✔ | ✔ | ✔ | ✔ |
| SysTools | ✔ | ✔ | ✔ | ✔ |
| Tao | ✔ | ✔ | ✔ | ✔ |
| TecPlot ² | ✔ | ✔ | ✔ | ✔ |
| TotalView ¹ | ✔ | ✔ | ✔ | ✔ |
| UDUNITS | ✔ | ✔ | ✔ | ✔ |
| Valgrind | ✘ | ✔ | ✔ | ✘ |
| VMD | ✔ | ✔ | ✔ | ✔ |
| Weka | ✔ | ✔ | ✔ | ✔ |
¹ Only users on Purdue's West Lafayette campus may use this software.
² Only specific research groups may use this software.
Please contact rcac-help@purdue.edu for specific questions about software license restrictions on ITaP research systems.
ITaP uses the module command as the preferred method to manage your processing environment. With this command, you may load applications and compilers along with their libraries and paths. Modules are packages which you load and unload as needed.
Please use the module command and do not manually configure your environment, as ITaP staff will frequently make changes to the specifics of various packages. If you use the module command to manage your environment, these changes will not be noticeable.
To view a brief usage report:
$ module
Below follows a short introduction to the module command. You can see more in the man page for module.
To see what modules are available on this system:
$ module avail
To see which versions of a specific compiler are available on this system:
$ module avail gcc
$ module avail intel
$ module avail pgi
To see available modules with MPI:
$ module avail mvapich
$ module avail openmpi
To see available modules for specific provided applications, use names from the list obtained with the command module avail:
$ module avail abaqus
$ module avail matlab
$ module avail mathematica
All modules consist of both a name and a version number. When loading a module, you may use only the name to load the default version, or you may specify which version you wish to load.
For each cluster, ITaP makes a recommendation regarding the set of compiler, math library, and message-passing library for parallel code. To load the recommended set:
$ module load devel
To verify what you loaded:
$ module list
To load the default version of a specific compiler, choose one of the following commands:
$ module load gcc
$ module load intel
$ module load pgi
To load a specific version of the Intel compiler, include the version number:
$ module load intel/11.1.072
When running a job, you must load any relevant modules on the compute node(s) from within your job submission file. Loading modules on the front end before submitting your job is sufficient while you use the front end during the development phase of your application, but it is not sufficient when the job runs on the compute node(s) during the production phase. You must load the same modules on the compute node(s).
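To make this concrete, here is a minimal sketch of a job submission file that loads modules on the compute node before running a program. It assumes the TORQUE/PBS batch system that Radon uses; the file name myjob.sub, the resource request, and the executable myprogram are hypothetical placeholders, so adjust them to your own job. The sketch only writes the file and checks its syntax with sh -n; it does not submit anything.

```shell
#!/bin/sh
# Write a hypothetical job submission file. Lines starting with "#PBS"
# are read by the batch system and ignored by the shell as comments.
cat > myjob.sub <<'EOF'
#!/bin/sh
#PBS -l nodes=1:ppn=8
#PBS -l walltime=00:10:00

# Start in the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

# Load the same modules here that you used on the front end.
module load devel
module list

# Run the program built with those modules (placeholder name).
./myprogram
EOF

# Check the file for shell syntax errors without executing it.
sh -n myjob.sub && echo "myjob.sub: syntax ok"

# Submit from the cluster front end (left commented out here):
# qsub myjob.sub
```

Because the module commands live in the file itself, the job no longer depends on whatever happened to be loaded in the shell that submitted it.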
To unload a module, enter the same module name used to load that module. Unloading will attempt to undo the environmental changes which a previous load command installed.
To unload the default version of a specific compiler:
$ module unload gcc
$ module unload intel
$ module unload pgi
To unload a specific version of the Intel compiler, include the same version number used to load that Intel compiler:
$ module unload intel/11.1.072
Apply the same methods to manage the modules of provided applications:
$ module load matlab
$ module unload matlab
To unload all currently loaded modules:
$ module purge
To see currently loaded modules:
$ module list
Currently Loaded Modulefiles:
1) intel/12.1
To unload a module:
$ module unload intel
$ module list
No Modulefiles Currently Loaded.
To learn more about what a module does to your environment, you may use the module show module_name command, where module_name is any name in the list from command module avail. This can be either a default name like "intel", "gcc", "pgi", and "matlab", or a specific version of a module, such as "intel/11.1.072". Here is an example showing what loading the default Intel compiler does to the processing environment:
$ module show intel
-------------------------------------------------------------------
/opt/modules/modulefiles/intel/12.1:
module-whatis invoke Intel 12.1.0 Compilers (64-bit)
prepend-path PATH /opt/intel/composer_xe_2011_sp1.6.233/bin/intel64
prepend-path LD_LIBRARY_PATH /opt/intel/composer_xe_2011_sp1.6.233/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21
prepend-path LD_LIBRARY_PATH /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64
prepend-path LD_LIBRARY_PATH /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64
prepend-path LIBRARY_PATH /opt/intel/composer_xe_2011_sp1.6.233/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21
prepend-path NLSPATH /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/locale/%l_%t/%N
prepend-path NLSPATH /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64/locale/%l_%t/%N
prepend-path CPATH /opt/intel/composer_xe_2011_sp1.6.233/tbb/include
setenv CC icc
setenv CXX icpc
setenv FC ifort
setenv ICC_HOME /opt/intel/composer_xe_2011_sp1.6.233
setenv IFORT_HOME /opt/intel/composer_xe_2011_sp1.6.233
setenv MKL_HOME /opt/intel/composer_xe_2011_sp1.8.273/mkl
setenv TBBROOT /opt/intel/composer_xe_2011_sp1.6.233/tbb
setenv LAPACK_INCLUDE -I/opt/intel/composer_xe_2011_sp1.8.273/mkl/include
setenv LAPACK_INCLUDE_F95 -I/opt/intel/composer_xe_2011_sp1.8.273/mkl/include/intel64/lp64
setenv LINK_LAPACK -L/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64
setenv LINK_LAPACK_STATIC -L/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -Bstatic -Wl,--start-group /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -Bdynamic -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64
setenv LINK_LAPACK95 -L/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64
setenv LINK_LAPACK95_STATIC -L/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64 -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -Bstatic -Wl,--start-group /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_intel_thread.a /opt/intel/composer_xe_2011_sp1.8.273/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -Bdynamic -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64
-------------------------------------------------------------------
To show what loading a specific Intel compiler version does to the processing environment:
$ module show intel/11.1.072
-------------------------------------------------------------------
/opt/modules/modulefiles/intel/11.1.072:
module-whatis invoke Intel 11.1.072 64-bit Compilers
prepend-path PATH /opt/intel/Compiler/11.1/072/bin/intel64
prepend-path LD_LIBRARY_PATH /opt/intel/mkl/10.2.5.035/lib/em64t
prepend-path LD_LIBRARY_PATH /opt/intel/Compiler/11.1/072/lib/intel64
prepend-path NLSPATH /opt/intel/mkl/10.2.5.035/lib/em64t/locale/%l_%t/%N
prepend-path NLSPATH /opt/intel/Compiler/11.1/072/idb/intel64/locale/%l_%t/%N
prepend-path NLSPATH /opt/intel/Compiler/11.1/072/lib/intel64/locale/%l_%t/%N
setenv CC icc
setenv CXX icpc
setenv FC ifort
setenv F90 ifort
setenv ICC_HOME /opt/intel/Compiler/11.1/072
setenv IFORT_HOME /opt/intel/Compiler/11.1/072
setenv MKL_HOME /opt/intel/mkl/10.2.5.035
setenv LAPACK_INCLUDE -I/opt/intel/mkl/10.2.5.035/include
setenv LAPACK_INCLUDE_F95 -I/opt/intel/mkl/10.2.5.035/include/em64t/lp64
setenv LINK_LAPACK -L/opt/intel/mkl/10.2.5.035/lib/em64t -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/mkl/10.2.5.035/lib/em64t
setenv LINK_LAPACK_STATIC -Bstatic -Wl,--start-group /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_lp64.a /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.a /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_core.a -Wl,--end-group -Bdynamic -liomp5 -lpthread
setenv LINK_LAPACK95 -L/opt/intel/mkl/10.2.5.035/lib/em64t -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -Xlinker -rpath -Xlinker /opt/intel/mkl/10.2.5.035/lib/em64t
setenv LINK_LAPACK95_STATIC -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -Bstatic -Wl,--start-group /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_lp64.a /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_intel_thread.a /opt/intel/mkl/10.2.5.035/lib/em64t/libmkl_core.a -Wl,--end-group -Bdynamic -liomp5 -lpthread
-------------------------------------------------------------------
Compilers are available on Radon for Fortran 77, Fortran 90, Fortran 95, C, and C++. The compilers can produce general-purpose and architecture-specific optimizations to improve performance. These include loop-level optimizations, inter-procedural analysis and cache optimizations. The compilers support automatic and user-directed parallelization of Fortran, C, and C++ applications for multiprocessing execution. More detailed documentation on each compiler set available on Radon follows.
On Radon, ITaP recommends the following set of compiler, math library, and message-passing library for parallel code:
To load the recommended set:
$ module load devel
$ module list
One or more versions of the Intel compiler set (compilers and associated libraries) are available on Radon. To discover which ones:
$ module avail intel/
$ module avail openmpi
$ module avail mvapich
$ module avail mpich
Choose an appropriate Intel module and load it. For example:
$ module load intel
Here are some examples for the Intel compilers:
| Language | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran 77 | $ ifort myprogram.f -o myprogram | $ mpif77 myprogram.f -o myprogram | $ ifort -openmp myprogram.f -o myprogram |
| Fortran 90 | $ ifort myprogram.f90 -o myprogram | $ mpif90 myprogram.f90 -o myprogram | $ ifort -openmp myprogram.f90 -o myprogram |
| Fortran 95 | (same as Fortran 90) | (same as Fortran 90) | (same as Fortran 90) |
| C | $ icc myprogram.c -o myprogram | $ mpicc myprogram.c -o myprogram | $ icc -openmp myprogram.c -o myprogram |
| C++ | $ icpc myprogram.cpp -o myprogram | $ mpiCC myprogram.cpp -o myprogram | $ icpc -openmp myprogram.cpp -o myprogram |
More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module, or online here:
For more documentation on the Intel compilers:
The official name of the GNU compilers is "GNU Compiler Collection" or "GCC". One or more versions of the GNU compiler set (compilers and associated libraries) are available on Radon. To discover which ones:
$ module avail gcc
$ module avail openmpi
$ module avail mvapich
$ module avail mpich
Choose an appropriate GCC module and load it. For example:
$ module load gcc
An older version of the GNU compiler will be in your path by default. Do NOT use this version. Instead, load a newer version using the command module load gcc.
Here are some examples for the GNU compilers:
| Language | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran 77 | $ gfortran myprogram.f -o myprogram | $ mpif77 myprogram.f -o myprogram | $ gfortran -fopenmp myprogram.f -o myprogram |
| Fortran 90 | $ gfortran myprogram.f90 -o myprogram | $ mpif90 myprogram.f90 -o myprogram | $ gfortran -fopenmp myprogram.f90 -o myprogram |
| Fortran 95 | $ gfortran myprogram.f95 -o myprogram | $ mpif90 myprogram.f95 -o myprogram | $ gfortran -fopenmp myprogram.f95 -o myprogram |
| C | $ gcc myprogram.c -o myprogram | $ mpicc myprogram.c -o myprogram | $ gcc -fopenmp myprogram.c -o myprogram |
| C++ | $ g++ myprogram.cpp -o myprogram | $ mpiCC myprogram.cpp -o myprogram | $ g++ -fopenmp myprogram.cpp -o myprogram |
More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module, or online here:
For more documentation on the GCC compilers:
One or more versions of the PGI compiler set (compilers and associated libraries) are available on Radon. To discover which ones:
$ module avail pgi
$ module avail openmpi
$ module avail mvapich
$ module avail mpich
Choose an appropriate PGI module and load it. For example:
$ module load pgi
Here are some examples for the PGI compilers:
| Language | Serial Program | MPI Program | OpenMP Program |
|---|---|---|---|
| Fortran 77 | $ pgf77 myprogram.f -o myprogram | $ mpif77 myprogram.f -o myprogram | $ pgf77 -mp myprogram.f -o myprogram |
| Fortran 90 | $ pgf90 myprogram.f90 -o myprogram | $ mpif90 myprogram.f90 -o myprogram | $ pgf90 -mp myprogram.f90 -o myprogram |
| Fortran 95 | $ pgf95 myprogram.f95 -o myprogram | $ mpif90 myprogram.f95 -o myprogram | $ pgf95 -mp myprogram.f95 -o myprogram |
| C | $ pgcc myprogram.c -o myprogram | $ mpicc myprogram.c -o myprogram | $ pgcc -mp myprogram.c -o myprogram |
| C++ | $ pgCC myprogram.cpp -o myprogram | $ mpiCC myprogram.cpp -o myprogram | $ pgCC -mp myprogram.cpp -o myprogram |
More information on compiler options can be found in the official man pages, which are accessible with the man command after loading the appropriate compiler module, or online here:
For more documentation on the PGI compilers:
A serial program is a single process which executes as a sequential stream of instructions on one computer. Compilers capable of serial programming are available for C, C++, and versions of Fortran.
Here are a few sample serial programs:
To load a compiler, enter one of the following:
$ module load intel
$ module load gcc
$ module load pgi
The following table illustrates how to compile your serial program:
| Language | Intel Compiler | GNU Compiler | PGI Compiler |
|---|---|---|---|
| Fortran 77 | $ ifort myprogram.f -o myprogram | $ gfortran myprogram.f -o myprogram | $ pgf77 myprogram.f -o myprogram |
| Fortran 90 | $ ifort myprogram.f90 -o myprogram | $ gfortran myprogram.f90 -o myprogram | $ pgf90 myprogram.f90 -o myprogram |
| Fortran 95 | $ ifort myprogram.f90 -o myprogram | $ gfortran myprogram.f95 -o myprogram | $ pgf95 myprogram.f95 -o myprogram |
| C | $ icc myprogram.c -o myprogram | $ gcc myprogram.c -o myprogram | $ pgcc myprogram.c -o myprogram |
| C++ | $ icpc myprogram.cpp -o myprogram | $ g++ myprogram.cpp -o myprogram | $ pgCC myprogram.cpp -o myprogram |
The Intel, GNU and PGI compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
A message-passing program is a set of processes (often multiple copies of a single process) that take advantage of distributed-memory systems by communicating with each other via the sending and receiving of messages. The Message-Passing Interface (MPI) is a standardized specification of the message-passing model, defined as a collection of library functions. Open MPI, MPICH2 and MVAPICH2 are three implementations of the MPI-2 standard. Libraries for Open MPI, MPICH2 and MVAPICH2 and compilers for C, C++, and versions of Fortran are available.
MPI programs require including a header file:
| Language | Header Files |
|---|---|
| Fortran 77 | INCLUDE 'mpif.h' |
| Fortran 90 | INCLUDE 'mpif.h' |
| Fortran 95 | INCLUDE 'mpif.h' |
| C | #include <mpi.h> |
| C++ | #include <mpi.h> |
Here are a few sample programs using MPI:
To see the available MPI libraries:
$ module avail openmpi
$ module avail mvapich
$ module avail mpich
The following table illustrates how to compile your message-passing program. Any compiler flags accepted by ifort/icc compilers are compatible with mpif77/mpicc.
| Language | Intel Compiler | GNU Compiler | PGI Compiler |
|---|---|---|---|
| Fortran 77 | $ mpif77 program.f -o program | $ mpif77 program.f -o program | $ mpif77 program.f -o program |
| Fortran 90 | $ mpif90 program.f90 -o program | $ mpif90 program.f90 -o program | $ mpif90 program.f90 -o program |
| Fortran 95 | $ mpif90 program.f90 -o program | $ mpif90 program.f95 -o program | $ mpif90 program.f95 -o program |
| C | $ mpicc program.c -o program | $ mpicc program.c -o program | $ mpicc program.c -o program |
| C++ | $ mpiCC program.C -o program | $ mpiCC program.C -o program | $ mpiCC program.C -o program |
The Intel, GNU and PGI compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Here is some more documentation from other sources on the MPI libraries:
A shared-memory program is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. Open Multi-Processing (OpenMP) is a specific API for the shared-memory model and consists of parallelization directives, library routines, and environment variables. It distributes the work of a process over several cores of a multi-core processor. Compilers which include OpenMP are available for C, C++, and versions of Fortran.
OpenMP programs require including a header file:
| Language | Header Files |
|---|---|
| Fortran 77 | |
| Fortran 90 | use omp_lib |
| Fortran 95 | use omp_lib |
| C | #include <omp.h> |
| C++ | #include <omp.h> |
Sample programs illustrate task parallelism of OpenMP:
A sample program illustrates loop-level (data) parallelism of OpenMP:
To load a compiler, enter one of the following:
$ module load intel
$ module load gcc
$ module load pgi
The following table illustrates how to compile your shared-memory program. Any compiler flags accepted by ifort/icc compilers are compatible with OpenMP.
| Language | Intel Compiler | GNU Compiler | PGI Compiler |
|---|---|---|---|
| Fortran 77 | $ ifort -openmp myprogram.f -o myprogram | $ gfortran -fopenmp myprogram.f -o myprogram | $ pgf77 -mp myprogram.f -o myprogram |
| Fortran 90 | $ ifort -openmp myprogram.f90 -o myprogram | $ gfortran -fopenmp myprogram.f90 -o myprogram | $ pgf90 -mp myprogram.f90 -o myprogram |
| Fortran 95 | $ ifort -openmp myprogram.f90 -o myprogram | $ gfortran -fopenmp myprogram.f95 -o myprogram | $ pgf95 -mp myprogram.f95 -o myprogram |
| C | $ icc -openmp myprogram.c -o myprogram | $ gcc -fopenmp myprogram.c -o myprogram | $ pgcc -mp myprogram.c -o myprogram |
| C++ | $ icpc -openmp myprogram.cpp -o myprogram | $ g++ -fopenmp myprogram.cpp -o myprogram | $ pgCC -mp myprogram.cpp -o myprogram |
The Intel, GNU and PGI compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Here is some more documentation from other sources on OpenMP:
A hybrid program combines both message-passing and shared-memory attributes to take advantage of compute clusters with multi-core compute nodes. Libraries for Open MPI, MPICH2, and MVAPICH2 and compilers which include OpenMP for C, C++, and versions of Fortran are available.
Hybrid programs require including header files:
| Language | Header Files |
|---|---|
| Fortran 77 | INCLUDE 'mpif.h' |
| Fortran 90 | use omp_lib INCLUDE 'mpif.h' |
| Fortran 95 | use omp_lib INCLUDE 'mpif.h' |
| C | #include <mpi.h> #include <omp.h> |
| C++ | #include <mpi.h> #include <omp.h> |
A few examples illustrate hybrid programs with task parallelism of OpenMP:
This example illustrates a hybrid program with loop-level (data) parallelism of OpenMP:
To see the available MPI libraries:
$ module avail openmpi
$ module avail mvapich
$ module avail mpich
The following table illustrates how to compile your hybrid (MPI/OpenMP) program. Any compiler flags accepted by ifort/icc compilers are compatible with mpif77/mpicc and OpenMP.
| Language | Intel Compiler | GNU Compiler | PGI Compiler |
|---|---|---|---|
| Fortran 77 | $ mpif77 -openmp myprogram.f -o myprogram | $ mpif77 -fopenmp myprogram.f -o myprogram | $ mpif77 -mp myprogram.f -o myprogram |
| Fortran 90 | $ mpif90 -openmp myprogram.f90 -o myprogram | $ mpif90 -fopenmp myprogram.f90 -o myprogram | $ mpif90 -mp myprogram.f90 -o myprogram |
| Fortran 95 | $ mpif90 -openmp myprogram.f90 -o myprogram | $ mpif90 -fopenmp myprogram.f95 -o myprogram | $ mpif90 -mp myprogram.f95 -o myprogram |
| C | $ mpicc -openmp myprogram.c -o myprogram | $ mpicc -fopenmp myprogram.c -o myprogram | $ mpicc -mp myprogram.c -o myprogram |
| C++ | $ mpiCC -openmp myprogram.C -o myprogram | $ mpiCC -fopenmp myprogram.C -o myprogram | $ mpiCC -mp myprogram.C -o myprogram |
The Intel, GNU and PGI compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Some mathematical libraries are available on Radon. More detailed documentation about the libraries available on Radon follows.
Intel Math Kernel Library (MKL) contains ScaLAPACK, LAPACK, Sparse Solver, BLAS, Sparse BLAS, CBLAS, GMP, FFTs, DFTs, VSL, VML, and Interval Arithmetic routines. MKL resides in the directory /opt/intel/mkl/9.1, and it has the following subdirectory structure:
Here are some example combinations of linking options:
(static linking of LAPACK and Kernels)
$ myfortrancompiler myprogram.f -L${MKLPATH} -lmkl_lapack -lmkl_ia32 -lguide -lpthread
(static linking of Fortran-95 LAPACK Interface and Kernels)
$ myfortrancompiler myprogram.f95 -L${MKLPATH} -lmkl_lapack95 -lmkl_lapack -lmkl_ia32 -lguide -lpthread
(static linking of BLAS, Sparse BLAS, GMP, VML/VSL, Interval Arithmetic, and FFT/DFT)
$ myccompiler myprogram.c -L${MKLPATH} -lmkl_ia32 -lguide -lpthread -lm
(dynamic linking of BLAS or FFTs)
$ myccompiler myprogram.c -L${MKLPATH} -lmkl -lguide -lpthread
ITaP recommends that you use dynamic linking of libguide. If so, define LD_LIBRARY_PATH such that you are using the correct version of libguide at run time. If you use static linking of libguide (discouraged), then:
Here is some more documentation from other sources on the Intel MKL:
You may write different parts of a computing application in different programming languages. For example, an application might incorporate older, legacy code which performs numerical calculations written in Fortran. Systems functions might use C. A newer, main program which binds together all older code might use C++ to take advantage of the object orientation. This section illustrates a few simple examples.
For more information about mixing programming languages:
If the source file ends with .F, .fpp, or .FPP, cpp automatically preprocesses the source code before compilation. If you want to use the C preprocessor with source files that do not end with .F, use the following compiler option to specify the filename suffix:
$ gfortran -x f77-cpp-input myprogram.f
$ ... -cxxlib -gcc/-cxxlib -icc
For example, to preprocess source files that end with .f:
$ ifort -cpp myprogram.f
Generally, it is advisable to rename your file from myprogram.f to myprogram.F. The preprocessor then automatically runs when you compile the file.
For more information on combining C/C++ and Fortran:
A C language program calls routines written in Fortran 90, C, and C++. The routines change the value of a character argument. To understand what makes this example work, you must be aware of a few simple issues.
To discover how the chosen Fortran compiler handles the names of routines, apply the Linux command nm to the object file: nm filename.o. The Fortran compilers used in this example append an underscore after the name of a routine. The C program calls the Fortran routine with the underscore character.
Fortran uses pass-by-reference while C uses pass-by-value. Therefore, for a Fortran routine to pass a value back to a C program, the argument in the call to the Fortran routine must be a pointer (ampersand "&"). For a C++ routine to pass a value back to a C program, the C++ routine may use the pass-by-reference syntax (ampersand "&") of C++, while the C program again specifies a pointer (ampersand "&") in the call to the C++ routine.
The C++ compiler must know at the time of compiling the C++ routine that the C program will invoke the C++ routine with the C-style interface rather than the C++ interface.
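The name check described above can also be scripted. The following is a minimal Python sketch, not part of the original example. It assumes the default three-column `nm` layout (address, type letter, symbol name) and reports defined routines whose names end with an underscore. The helper name `fortran_style_symbols` and the sample listing are hypothetical; real `nm` output depends on your compiler.

```python
def fortran_style_symbols(nm_output):
    """Return names of defined text ("T") symbols that end with an underscore."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three columns: address, type letter, name.
        # Undefined symbols ("U") have no address column and are skipped.
        if len(fields) == 3 and fields[1] == "T" and fields[2].endswith("_"):
            names.append(fields[2])
    return names


# Hypothetical listing shaped like `nm f90.o` output; not captured on Radon.
SAMPLE_NM = """\
0000000000000000 T subr_f_
                 U _gfortran_st_write
"""

print(fortran_style_symbols(SAMPLE_NM))
```

If the list contains `subr_f_`, the calling program must use that name, including the underscore.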
The following files of source code illustrate these technical details:
Separately compile each source code file with the appropriate compiler into an object (.o) file. Then link the object files into a single executable file (a.out):
| Compiler | Intel | GNU | PGI |
|---|---|---|---|
| C Main Program | $ module load intel $ icc -c main.c $ ifort -c f90.f90 $ icc -c c.c $ icc -c cpp.cpp $ icc -lstdc++ main.o f90.o c.o cpp.o | $ module load gcc $ gcc -c main.c $ gfortran -c f90.f90 $ gcc -c c.c $ g++ -c cpp.cpp $ gcc -lstdc++ main.o f90.o c.o cpp.o | $ module load pgi $ pgcc -c main.c $ pgcc -c c.c $ pgCC -c cpp.cpp $ pgf90 -Mnomain main.o c.o cpp.o f90.f90 |
The results show that each routine successfully returns a different character to the main program:
$ a.out
main(), initial value: chr=X
main(), after function subr_f_(): chr=f
main(), after function func_c(): chr=c
main(), after function func_cpp(): chr=+
Exit main.c
A C++ language program calls routines written in Fortran 90, C, and C++. The routines change the value of a character argument. To understand what makes this example work, you must be aware of a few simple issues.
To discover how the chosen Fortran compiler handles the names of routines, apply the Linux command nm to the object file: nm filename.o. The Fortran compilers used in this example append an underscore after the name of a routine. The C++ program calls the Fortran routine with the underscore character.
Fortran uses pass-by-reference while C++ uses pass-by-value. Therefore, for a Fortran routine to pass a value back to a C++ program, the argument in the call to the Fortran routine must be a pointer (ampersand "&"). For a C routine to pass a value back to a C++ program, the C routine must declare a parameter as a pointer (asterisk "*"), while the C++ program again specifies a pointer (ampersand "&") in the call to the C routine.
The C++ compiler must know at the time of compiling the C++ program that the C++ program will invoke the Fortran and C routines with the C-style interface rather than the C++ interface.
The following files of source code illustrate these technical details:
Separately compile each source code file with the appropriate compiler into an object (.o) file. Then link the object files into a single executable file (a.out):
| Compiler | Intel | GNU | PGI |
|---|---|---|---|
| C++ Main Program | $ module load intel $ icc -c main.cpp $ ifort -c f90.f90 $ icc -c c.c $ icc -c cpp.cpp $ icc -lstdc++ main.o f90.o c.o cpp.o | $ module load gcc $ g++ -c main.cpp $ gfortran -c f90.f90 $ gcc -c c.c $ g++ -c cpp.cpp $ g++ main.o f90.o c.o cpp.o | $ module load pgi $ pgCC -c main.cpp $ pgf90 -c f90.f90 $ pgcc -c c.c $ pgCC -c cpp.cpp $ pgCC -L../lib main.o c.o cpp.o f90.o -pgf90libs |
The results show that each routine successfully returns a different character to the main program:
$ a.out
main(), initial value: chr=X
main(), after function subr_f_(): chr=f
main(), after function func_c(): chr=c
main(), after function func_cpp(): chr=+
Exit main.cpp
A Fortran language program calls routines written in Fortran 90, C, and C++. The routines change the value of a character argument. To understand what makes this example work, you must be aware of a few simple issues.
To discover how the chosen Fortran compiler handles the names of routines, apply the Linux command nm to the object file: nm filename.o. The Fortran compilers used in this example append an underscore after the name of a routine, so the definitions of the C and C++ routines must include the underscore. The Fortran program calls these routines without the underscore character in the Fortran source code.
Fortran uses pass-by-reference while C uses pass-by-value. Therefore, for a C routine to pass a value back to a Fortran program, the parameter of the C routine must be a pointer (asterisk "*") in the C routine's definition. For a C++ routine to pass a value back to a Fortran program, the C++ routine may use the pass-by-reference syntax (ampersand "&") of C++ in its definition.
The C++ compiler must know at the time of compiling the C++ routine that the Fortran program will invoke the C++ routine with the C-style interface rather than the C++ interface.
The following files of source code illustrate these technical details:
Separately compile each source code file with the appropriate compiler into an object (.o) file. Then link the object files into a single executable file (a.out):
| Compiler | Intel | GNU | PGI |
|---|---|---|---|
| Fortran 90 Main Program | $ module load intel $ ifort -c main.f90 $ ifort -c f90.f90 $ icc -c c.c $ icc -c cpp.cpp $ ifort -lstdc++ main.o f90.o c.o cpp.o | $ module load gcc $ gfortran -c main.f90 $ gfortran -c f90.f90 $ gcc -c c.c $ g++ -c cpp.cpp $ gfortran -lstdc++ main.o c.o cpp.o f90.o | $ module load pgi $ pgf90 -c main.f90 $ pgf90 -c f90.f90 $ pgcc -c c.c $ pgCC -c cpp.cpp $ pgf90 main.o c.o cpp.o f90.o |
The results show that each routine successfully returns a different character to the main program:
$ a.out
main(), initial value: chr=X
main(), after function subr_f(): chr=f
main(), after function subr_c(): chr=c
main(), after function func_cpp(): chr=+
Exit mixlang
There are two methods for submitting jobs to the Radon community cluster. First, you may use the portable batch system (PBS) to submit jobs directly to a queue on Radon. PBS performs job scheduling. Jobs may be serial, message-passing, shared-memory, or hybrid (message-passing + shared-memory) programs. You may use either the batch or interactive mode to run your jobs. Use the batch mode for finished programs; use the interactive mode only for debugging. Secondly, since the Radon cluster is a part of BoilerGrid, you may submit serial jobs to BoilerGrid and specifically request compute nodes on Radon.
The Portable Batch System (PBS) is a richly featured workload management system providing job scheduling and job management interface on computing resources, including Linux clusters. With PBS, a user requests resources and submits a job to a queue. The system will then take jobs from queues, allocate the necessary nodes, and execute them in as efficient a manner as it can.
Do NOT run large, long, multi-threaded, parallel, or CPU-intensive jobs on a front-end login host. All users share the front-end hosts, and running anything but the smallest test job will negatively impact everyone's ability to use Radon. Always use PBS to submit your work as a job. You may even submit interactive sessions as jobs. This section of documentation will explain how to use PBS.
Radon has only one queue, the "workq" queue, and it is open to all users of the system.
To submit work to a PBS queue, you must first create a job submission file. This job submission file is essentially a simple shell script. It will set any required environment variables, load any necessary modules, create or modify files and directories in your scratch space, and invoke any applications that you need. However, a job submission file can be as simple as the path to your application:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
# Print the hostname of the compute node on which this job is running.
/bin/hostname
Or, as simple as listing the names of compute nodes assigned to your job:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
# PBS_NODEFILE contains the names of assigned compute nodes.
cat $PBS_NODEFILE
PBS sets several potentially useful environment variables which you may use within your job submission files. Here is a list of some:
| Name | Description |
|---|---|
| PBS_O_WORKDIR | Absolute path of the current working directory when you submitted this job |
| PBS_JOBID | Job ID number assigned to this job by the batch system |
| PBS_JOBNAME | Job name supplied by the user |
| PBS_NODEFILE | File containing the list of nodes assigned to this job |
| PBS_O_HOST | Hostname of the system where you submitted this job |
| PBS_O_QUEUE | Name of the original queue to which you submitted this job |
| PBS_O_SYSTEM | Operating system name given by uname -s where you submitted this job |
| PBS_ENVIRONMENT | "PBS_BATCH" if this job is a batch job, or "PBS_INTERACTIVE" if this job is an interactive job |
Here is an example of a commonly used PBS variable, making sure a job runs from within the same directory that you submitted it from:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
# Change to the directory from which you originally submitted this job.
cd $PBS_O_WORKDIR
# Print out the current working directory path.
pwd
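A program started by your job can read the same PBS variables from its environment. The following Python sketch is illustrative only: the helper names `pbs_settings` and `is_interactive` are hypothetical, and the variable names are the ones listed in the table above. A variable that is not set (for example, when you run the program outside of PBS) comes back as `None`.

```python
import os

# Variable names taken from the table of PBS environment variables.
PBS_VARIABLES = (
    "PBS_O_WORKDIR", "PBS_JOBID", "PBS_JOBNAME", "PBS_NODEFILE",
    "PBS_O_HOST", "PBS_O_QUEUE", "PBS_O_SYSTEM", "PBS_ENVIRONMENT",
)


def pbs_settings(environ=None):
    """Return a dict of PBS variables; unset variables map to None."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in PBS_VARIABLES}


def is_interactive(environ=None):
    """True when PBS_ENVIRONMENT marks the job as interactive."""
    return pbs_settings(environ)["PBS_ENVIRONMENT"] == "PBS_INTERACTIVE"


if __name__ == "__main__":
    for name, value in pbs_settings().items():
        print(name, "=", value)
```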
You may also find the need to load a module to run a job on a compute node. Loading a module on a front end does NOT automatically load that module on the compute node where a job runs. You must use the job submission file to load a module on the compute node:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
# Load the module for NetPBM.
module load netpbm
# Convert a PostScript file to GIF format using NetPBM tools.
pstopnm myfilename.ps | ppmtogif > myfilename.gif
Once you have a job submission file, you may submit this script to PBS using the qsub command. PBS will find an available processor core or a set of processor cores and run your job there, or leave your job in a queue until some become available. At submission time, you may also optionally specify many other attributes or job requirements you have regarding where your jobs will run.
To submit your serial job to one processor core on one compute node with no special requirements:
$ qsub myjobsubmissionfile
The previous example relies on two defaults: one compute node and one processor core on that node. It is equivalent to:
$ qsub -l nodes=1:ppn=1 myjobsubmissionfile
To submit your job to a specific queue:
$ qsub -q myqueuename myjobsubmissionfile
By default, each job receives 30 minutes of wall time for its execution. The wall time is the total time in real clock time (not CPU cycles) that you believe your job will need to run to completion. If you know that your job will not need more than a certain amount of time to run, it is very much to your advantage to request less than the maximum allowable wall time, as this may allow your job to schedule and run sooner. To request the specific wall time of 1 hour and 30 minutes:
$ qsub -l nodes=1:ppn=1,walltime=01:30:00 myjobsubmissionfile
To request more than one processor core on one or more compute nodes:
$ qsub -l nodes=2:ppn=4 myjobsubmissionfile
The nodes resource indicates how many virtual nodes you would like reserved for your job. By default, PBS maps the nodes resource to a virtual node (that is, directly to a processor, not a full physical compute node). The node property ppn specifies how many processor cores you need on each virtual node. The previous example requests 2 virtual nodes with 4 processor cores each. PBS may or may not assign virtual nodes on different physical compute nodes. Each compute node in Radon has 8 processor cores. So, the two virtual nodes of this example can reside on a single compute node. Explanations regarding the distribution of your job across different compute nodes for parallel programs appear in the sections covering specific parallel programming libraries.
Here is a typical list of compute node names from a qsub command requesting 2 virtual nodes with 4 processor cores each:
radon-a639 radon-a639 radon-a639 radon-a639 radon-a638 radon-a638 radon-a638 radon-a638
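To check the geometry of a job, you can count how often each compute node name occurs in a node list such as the one above. The following Python sketch assumes the list is whitespace-separated text (one name per processor core, as in the file named by PBS_NODEFILE); the helper name `cores_per_host` is hypothetical.

```python
from collections import Counter


def cores_per_host(nodefile_text):
    """Count how many processor cores each compute node contributes."""
    return Counter(nodefile_text.split())


# The node list shown above, one entry per assigned processor core.
SAMPLE_NODES = """\
radon-a639 radon-a639 radon-a639 radon-a639
radon-a638 radon-a638 radon-a638 radon-a638
"""

for host, cores in cores_per_host(SAMPLE_NODES).items():
    print(host, cores)
```

For this list, each of the two compute nodes contributes four processor cores.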
Normally, compute nodes running your job may also be running jobs from other users. ITaP research systems have many processor cores in each compute node, so node sharing allows more efficient use of the system. However, if you have special needs that prohibit others from effectively sharing a compute node with your job, such as needing all of the memory on a compute node, you may request exclusive access to any compute nodes allocated to your job.
To request exclusive access to a compute node, set ppn to the maximum number of processor cores physically available on a compute node:
$ qsub -l nodes=1:ppn=8 myjobsubmissionfile
Note that if you request more than ppn=8 on Radon, your job will never run, because compute nodes in Radon have only 8 processor cores each.
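If you generate qsub commands from a script, you can catch an impossible ppn value before submitting. The sketch below is a hypothetical Python helper (`resource_request` is not a PBS tool). It assumes the limit of 8 processor cores per compute node stated in this guide and the 30-minute default wall time, and it builds only the text that follows `qsub -l`.

```python
CORES_PER_NODE = 8  # Radon compute nodes have 8 processor cores each.


def resource_request(nodes, ppn, walltime="00:30:00"):
    """Build the argument for `qsub -l`, rejecting requests that can never run."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 1 <= ppn <= CORES_PER_NODE:
        raise ValueError(f"ppn must be between 1 and {CORES_PER_NODE}")
    return f"nodes={nodes}:ppn={ppn},walltime={walltime}"


print("qsub -l", resource_request(1, 8, "01:30:00"), "myjobsubmissionfile")
```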
If more convenient, you may also specify any command line options to qsub from within your job submission file, using a special form of comment:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
#PBS -q myqueuename
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:30:00
#PBS -N myjobname
# Print the hostname of the compute node on which this job is running.
/bin/hostname
If an option is present in both your job submission file and on the command line, the option on the command line will take precedence.
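The precedence rule can be pictured with a small model. The following Python sketch is a deliberate simplification, not how qsub is implemented: it collects options from `#PBS` comment lines and then lets command-line options replace them flag by flag. The function names are hypothetical, and the wholesale replacement of a flag's values is an assumption made to keep the sketch short.

```python
import shlex


def directive_options(script_text):
    """Collect options from '#PBS' comment lines as {flag: [values]}."""
    options = {}
    for line in script_text.splitlines():
        if not line.startswith("#PBS"):
            continue
        words = shlex.split(line[len("#PBS"):])
        # Pair each flag (such as -q) with the word that follows it.
        for flag, value in zip(words[0::2], words[1::2]):
            options.setdefault(flag, []).append(value)
    return options


def effective_options(script_text, command_line):
    """Model the rule: options given on the command line take precedence."""
    merged = directive_options(script_text)
    merged.update(command_line)
    return merged


SCRIPT = """\
#!/bin/sh -l
# FILENAME: myjobsubmissionfile
#PBS -q myqueuename
#PBS -l nodes=1:ppn=8
#PBS -l walltime=01:30:00
#PBS -N myjobname
/bin/hostname
"""

# Equivalent in spirit to: qsub -q workq myjobsubmissionfile
print(effective_options(SCRIPT, {"-q": ["workq"]}))
```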
After you submit your job with qsub, it can reside in a queue for minutes, hours, or even weeks. How long it takes for a job to start depends on the specific queue, the number of compute nodes requested, the amount of wall time requested, and what other jobs already waiting in that queue requested as well. It is impossible to say for sure when any given job will start. For best results, request no more resources than your job requires.
PBS catches only output written to standard output and standard error. Standard output (output normally sent to the screen) will appear in your directory in a file whose extension begins with the letter "o", for example myjobsubmissionfile.o1234, where "1234" represents the PBS job ID. Errors that occurred during the job run and were written to standard error (output also normally sent to the screen) will appear in your directory in a file whose extension begins with the letter "e", for example myjobsubmissionfile.e1234. Often, the error file is empty. If your job wrote results to a file, those results will appear in that file.
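The naming pattern is easy to reproduce when a script needs to locate these files afterwards. The Python sketch below assumes the pattern described above (job name, then ".o" or ".e", then the numeric part of the job ID); the helper name `output_file_names` is hypothetical.

```python
def output_file_names(job_name, job_id):
    """Return the (standard output, standard error) file names for a job."""
    # qsub prints IDs such as "99.radon-adm.rcac.purdue.edu";
    # only the leading number appears in the file names.
    number = str(job_id).split(".")[0]
    return f"{job_name}.o{number}", f"{job_name}.e{number}"


print(output_file_names("myjobsubmissionfile", "1234"))
```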
Parallel applications may require special care in the selection of PBS resources. Please refer to the sections that follow for details on how to run parallel applications with various parallel libraries.
The command qstat -a will list all jobs currently queued or running and some information about each:
$ qstat -a
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
107025.radon user123 workq hello -- 1 8 -- 00:05 Q --
115505.radon user456 ncn job4 5601 1 1 -- 600:0 R 575:0
...
189479.radon user456 workq AR4b -- 5 40 -- 04:00 H --
189481.radon user789 workq STDIN 1415 1 1 -- 00:30 R 00:07
189483.radon user789 workq STDIN 1758 1 1 -- 00:30 R 00:07
189484.radon user456 workq AR4b -- 5 40 -- 04:00 H --
189485.radon user456 workq AR4b -- 5 40 -- 04:00 Q --
189486.radon user123 tg_workq STDIN -- 1 1 -- 12:00 Q --
189490.radon user456 workq job7 26655 1 8 -- 04:00 R 00:06
189491.radon user123 workq job11 -- 1 8 -- 04:00 Q --
The status of each job listed appears in the "S" column toward the right. Possible status codes are: "Q" = Queued, "R" = Running, "C" = Completed, and "H" = Held.
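For long listings, a short script can tally the status codes. The Python sketch below assumes the eleven-column row layout of the `qstat -a` listing above, with the status code in the tenth column. The helper name `tally_states` is hypothetical, and the sample rows are copied from that listing.

```python
from collections import Counter

STATUS_CODES = {"Q", "R", "C", "H"}


def tally_states(qstat_text):
    """Count jobs per status code in `qstat -a` style output."""
    tally = Counter()
    for line in qstat_text.splitlines():
        fields = line.split()
        # Job rows have 11 columns and start with a numeric job ID;
        # header and separator lines fail at least one of these checks.
        if len(fields) == 11 and fields[0][:1].isdigit() and fields[9] in STATUS_CODES:
            tally[fields[9]] += 1
    return tally


SAMPLE_QSTAT = """\
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
107025.radon user123 workq hello -- 1 8 -- 00:05 Q --
189479.radon user456 workq AR4b -- 5 40 -- 04:00 H --
189481.radon user789 workq STDIN 1415 1 1 -- 00:30 R 00:07
189490.radon user456 workq job7 26655 1 8 -- 04:00 R 00:06
"""

print(dict(tally_states(SAMPLE_QSTAT)))
```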
To see only your own jobs, use the -u option to qstat and specify your own username:
$ qstat -a -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- ---------- -------- ---------- ------ --- --- ------ ----- - -----
182792.radon myusername workq job1 28422 1 4 -- 23:00 R 20:19
185841.radon myusername workq job2 24445 1 4 -- 23:00 R 20:19
185844.radon myusername workq job3 12999 1 4 -- 23:00 R 20:18
185847.radon myusername workq job4 13151 1 4 -- 23:00 R 20:18
To retrieve useful information about your queued or running job, use the checkjob command with your job's ID number. The output should look similar to the following:
$ checkjob -v 163000
job 163000 (RM job '163000.radon-adm.rcac.purdue.edu')
AName: test
State: Idle
Creds: user:myusername group:mygroup class:myqueue
WallTime: 00:00:00 of 20:00:00
SubmitTime: Wed Apr 18 09:08:37
(Time Queued Total: 1:24:36 Eligible: 00:00:23)
NodeMatchPolicy: EXACTNODE
Total Requested Tasks: 2
Total Requested Nodes: 1
Req[0] TaskCount: 2 Partition: ALL
TasksPerNode: 2 NodeCount: 1
Notification Events: JobFail
IWD: /home/myusername/gaussian
UMask: 0000
OutputFile: radon-fe00.rcac.purdue.edu:/home/myusername/gaussian/test.o163000
ErrorFile: radon-fe00.rcac.purdue.edu:/home/myusername/gaussian/test.e163000
User Specified Partition List: radon-adm,SHARED
Partition List: radon-adm
SrcRM: radon-adm DstRM: radon-adm DstRMJID: 163000.radon-adm.rcac.purdue.edu
Submit Args: -l nodes=1:ppn=2,walltime=20:00:00 -q myqueue
Flags: RESTARTABLE
Attr: checkpoint
StartPriority: 1000
PE: 2.00
NOTE: job violates constraints for partition radon-adm (job 163000 violates active HARD MAXPROC limit of 160 for class myqueue partition ALL (Req: 2 InUse: 160))
BLOCK MSG: job 163000 violates active HARD MAXPROC limit of 160 for class myqueue partition ALL (Req: 2 InUse: 160) (recorded at last scheduling iteration)
There are several useful bits of information in this output.
To stop a job before it finishes or remove it from a queue, use the qdel command:
$ qdel myjobid
You can find the job ID using the qstat command as explained in the PBS Job Status section.
To submit jobs successfully, you must understand how to request the right computing resources. This section contains examples of specific types of PBS jobs. These examples illustrate requesting various groupings of nodes and processor cores, using various parallel libraries, and running interactive jobs. You may wish to look here for an example that is most similar to your application and use a modified version of that example's job submission file for your jobs.
This simple example submits the job submission file hello.sub to the workq queue on Radon and requests 4 nodes:
$ qsub -q workq -l nodes=4,walltime=00:01:00 hello.sub
99.radon-adm.rcac.purdue.edu
Remember that ppn can not be larger than the number of processor cores on each node.
After your job finishes running, the ls command will show two new files in your directory, the .o and .e files:
$ ls -l
hello
hello.c
hello.out
hello.sub
hello.sub.e99
hello.sub.o99
If everything went well, then the file hello.sub.e99 will be empty, since it contains any error messages your program gave while running. The file hello.sub.o99 contains the output from your program.
If you would like to see the value of the environment variables from within a PBS job, you can prepare a job submission file with an appropriate filename, here named env.sub:
#!/bin/sh -l
# FILENAME: env.sub
# Request four nodes, 1 processor core on each.
#PBS -l nodes=4:ppn=1,walltime=00:01:00
# Change to the directory from which you submitted your job.
cd $PBS_O_WORKDIR
# Show details, especially nodes.
# The results of most of the following commands appear in the error file.
echo $PBS_O_HOST
echo $PBS_O_QUEUE
echo $PBS_O_SYSTEM
echo $PBS_O_WORKDIR
echo $PBS_ENVIRONMENT
echo $PBS_JOBID
echo $PBS_JOBNAME
# PBS_NODEFILE contains the names of assigned compute nodes.
cat $PBS_NODEFILE
Submit this job:
$ qsub env.sub
This section illustrates various requests for one or multiple compute nodes and ways of allocating the processor cores on these compute nodes. Each example submits a job submission file (myjobsubmissionfile.sub) to a batch session. The job submission file contains a single command cat $PBS_NODEFILE to show the names of the compute node(s) allocated. The list of compute node names indicates the geometry chosen for the job:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile.sub
cat $PBS_NODEFILE
All examples use the default queue of the cluster.
One processor core on any compute node
A job shares the other resources, in particular the memory, of the compute node with other jobs. This request is typical of a serial job:
$ qsub -l nodes=1 myjobsubmissionfile.sub
Compute node allocated:
radon-a639
Two processor cores on any compute nodes
This request is typical of a distributed-memory (MPI) job:
$ qsub -l nodes=2 myjobsubmissionfile.sub
Compute node(s) allocated:
radon-a639 radon-a638
All processor cores on one compute node
The option ppn can not be larger than the number of cores on each compute node on the machine in question. This request is typical of a shared-memory (OpenMP) job:
$ qsub -l nodes=1:ppn=8 myjobsubmissionfile.sub
Compute node allocated:
radon-a637 radon-a637 radon-a637 radon-a637 radon-a637 radon-a637 radon-a637 radon-a637
All processor cores on any two compute nodes
The option ppn can not be larger than the number of processor cores on each compute node on the machine in question. This request is typical of a hybrid (distributed-memory and shared-memory) job:
$ qsub -l nodes=2:ppn=8 myjobsubmissionfile.sub
Compute nodes allocated:
radon-a639 radon-a639 radon-a639 radon-a639 radon-a639 radon-a639 radon-a639 radon-a639 radon-a638 radon-a638 radon-a638 radon-a638 radon-a638 radon-a638 radon-a638 radon-a638
Multinode geometry from option nodes is one processor core per node (scattered placement)
$ qsub -l nodes=8 myjobsubmissionfile.sub
radon-a001 radon-a003 radon-a004 radon-a005 radon-a006 radon-a007 radon-a008 radon-a009
Multinode geometry from option procs is one or more processor cores per node (free placement)
$ qsub -l procs=8 myjobsubmissionfile.sub
The placement of processor cores can range from all on one compute node (packed) to all on unique compute nodes (scattered). A few examples follow:
radon-a001 radon-a001 radon-a001 radon-a001 radon-a001 radon-a001 radon-a001 radon-a001
radon-a001 radon-a001 radon-a001 radon-a002 radon-a002 radon-a003 radon-a004 radon-a004
radon-a000 radon-a001 radon-a002 radon-a003 radon-a004 radon-a005 radon-a006 radon-a007
Four compute nodes, each with two processor cores
$ qsub -l nodes=4:ppn=2 myjobsubmissionfile.sub
radon-a001 radon-a001 radon-a003 radon-a003 radon-a004 radon-a004 radon-a005 radon-a005
Eight processor cores can come from any four compute nodes
$ qsub -l nodes=4 -l procs=8 myjobsubmissionfile.sub
radon-a001 radon-a001 radon-a003 radon-a003 radon-a004 radon-a004 radon-a005 radon-a005
Exclusive access to one compute node, using one processor core
Achieving this geometry requires modifying the job submission file, here named myjobsubmissionfile.sub:
#!/bin/sh -l
# FILENAME: myjobsubmissionfile.sub
cat $PBS_NODEFILE
uniq <$PBS_NODEFILE >nodefile
echo " "
cat nodefile
To gain exclusive access to a compute node, specify all processor cores that are physically available on a compute node:
$ qsub -l nodes=1:ppn=8 myjobsubmissionfile.sub
radon-a005 radon-a005 ... radon-a005 radon-a005
This request is typical of a serial job that needs access to all of the memory of a compute node.
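The effect of `uniq` on the node file can be reproduced in a few lines of Python, which may help when a job driver is itself a Python program. This is a sketch under the assumption that the node file lists one host name per processor core; the helper name `unique_hosts` is hypothetical. Like `uniq`, it collapses only adjacent duplicates.

```python
from itertools import groupby


def unique_hosts(nodefile_text):
    """Collapse runs of identical adjacent host names, as `uniq` does."""
    return [host for host, _run in groupby(nodefile_text.split())]


# Eight entries for one compute node, as produced by nodes=1:ppn=8.
SAMPLE_NODEFILE = "radon-a005\n" * 8

print(unique_hosts(SAMPLE_NODEFILE))
```

The single remaining name is the compute node that the job holds exclusively.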
You may also request that a job be run on specific nodes based on various quantities such as node memory.
These examples submit a job submission file, here named myjobsubmissionfile.sub, to the default queue. The job submission file contains a single command (cat $PBS_NODEFILE) to show the allocated node(s).
Example: request a node of Radon by requesting 16 GB of memory:
$ qsub -l nodes=1:pmem=16G myjobsubmissionfile.sub
Node allocated:
radon-d000
Interactive jobs can run on compute nodes. You can start interactive jobs either with specific time constraints (walltime=hh:mm:ss) or with the default time constraints of the queue to which you submit your job. PBS applies the maximum wall time of the queue to all jobs, including interactive jobs.
If you request an interactive job without a wall time option, PBS assigns to your job the default wall time limit for the queue to which you submit. If this is shorter than the time you actually need, your job will terminate before completion. If, on the other hand, this time is longer than what you actually need, you are effectively withholding computing resources from other users. For this reason, it is best to always pass a reasonable wall time value to PBS for interactive jobs.
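When you estimate run time in minutes, a small helper can turn the estimate into the hh:mm:ss form that the walltime option expects. The Python sketch below is illustrative; the function name `walltime` is hypothetical and it assumes whole minutes.

```python
def walltime(minutes):
    """Format a positive number of whole minutes as hh:mm:ss."""
    if minutes <= 0:
        raise ValueError("wall time must be positive")
    hours, mins = divmod(int(minutes), 60)
    return f"{hours:02d}:{mins:02d}:00"


print("qsub -I -l walltime=" + walltime(90))
```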
Once your interactive job starts, you may use that connection as an interactive shell and invoke whatever other programs or other commands you wish. To submit an interactive job with one minute of wall time, use the -I option to qsub:
$ qsub -I -l walltime=00:01:00
qsub: waiting for job 100.radon-adm.rcac.purdue.edu to start
qsub: job 100.radon-adm.rcac.purdue.edu ready
If you need to use a remote X11 display from within your job (see the SSH X11 Forwarding Section), add the -v DISPLAY option to qsub as well:
$ qsub -I -l walltime=00:01:00 -v DISPLAY
qsub: waiting for job 101.radon-adm.rcac.purdue.edu to start
qsub: job 101.radon-adm.rcac.purdue.edu ready
To quit your interactive job:
logout
A serial job is a single process whose steps execute as a sequential stream of instructions on one processor core.
This section illustrates how to use PBS to submit to a batch session one of the serial programs compiled in the section Compiling Serial Programs. There is no difference in running a Fortran, C, or C++ serial program after compiling and linking it into an executable file.
Suppose that you named your executable file serial_hello. Prepare a job submission file with an appropriate filename, here named serial_hello.sub:
#!/bin/sh -l
# FILENAME: serial_hello.sub
module load devel
cd $PBS_O_WORKDIR
./serial_hello
Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the executable program.
Submit the serial job to the default queue on Radon and request 1 compute node with 1 processor core and 1 minute of wall time. Requesting the default queue does not require explicitly asking for it. Job completion can take a while depending on the demand placed on the compute cluster:
$ qsub -l nodes=1:ppn=1,walltime=00:01:00 ./serial_hello.sub
View two new files in your directory (.o and .e):
$ ls -l
serial_hello
serial_hello.c
serial_hello.sub
serial_hello.sub.emyjobid
serial_hello.sub.omyjobid
View results in the output file:
$ cat serial_hello.sub.omyjobid
Runhost:radon-a639.rcac.purdue.edu hello, world
If the job failed to run, then view error messages in the file serial_hello.sub.emyjobid.
If a serial job uses a lot of memory and finds the memory of a compute node overcommitted while sharing the compute node with other jobs, specify the number of processor cores physically available on the compute node to gain exclusive use of the compute node:
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 serial_hello.sub
View results in the output file:
$ cat serial_hello.sub.omyjobid
Runhost:radon-a639.rcac.purdue.edu hello, world
A message-passing job is a set of processes (often multiple copies of a single process) that take advantage of distributed-memory systems by communicating with each other via the sending and receiving of messages. Work occurs across several compute nodes of a distributed-memory system. The Message-Passing Interface (MPI) is a standardized specification of the message-passing model, defined as a collection of library functions. Open MPI, MPICH2, and MVAPICH2 are three implementations of the MPI-2 standard.
This section illustrates how to use PBS to submit to a batch session one of the MPI programs compiled in the section Compiling MPI Programs. There is no difference in running a Fortran, C, or C++ MPI program after compiling and linking it into an executable file.
The path to relevant MPI libraries is not set up on any run host by default. Using module load is the preferred way to access these libraries. Use module avail to see all software packages installed on Radon, including MPI library packages. Then, to employ one of the available MPI modules, enter the module load command.
Suppose that you named your executable file mpi_hello. Prepare a job submission file with an appropriate filename, here named mpi_hello.sub:
#!/bin/sh -l
# FILENAME: mpi_hello.sub
module load devel
cd $PBS_O_WORKDIR
mpiexec -n 16 ./mpi_hello
You can load any MPI library/compiler module that is available on Radon. This example uses the recommended library, Open MPI.
Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the executable program.
You invoke an MPI program with the mpiexec command. The number of processes requested with mpiexec -n is usually equal to the number of MPI ranks of the application and should typically be equal to the total number of processor cores you request from PBS (more on this below).
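The relationship between the PBS request and the mpiexec rank count is simple arithmetic: ranks = nodes x ppn. The Python sketch below builds a matching command line; the helper name `mpiexec_command` is hypothetical, and it assumes one MPI rank per requested processor core.

```python
def mpiexec_command(nodes, ppn, program):
    """Build an mpiexec argument list with one rank per requested core."""
    ranks = nodes * ppn
    return ["mpiexec", "-n", str(ranks), program]


# A request of nodes=2:ppn=8 corresponds to 16 ranks.
print(" ".join(mpiexec_command(2, 8, "./mpi_hello")))
```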
Submit the MPI job to the default queue on Radon and request 2 compute nodes with all 8 processor cores and 8 MPI ranks on each compute node and 1 minute of wall time. This will use two complete compute nodes of the Radon cluster. Requesting the default queue does not require explicitly asking for it. Job completion can take a while depending on the demand placed on the compute cluster.
$ qsub -l nodes=2:ppn=8,walltime=00:01:00 ./mpi_hello.sub
View two new files in your directory (.o and .e):
$ ls -l
mpi_hello
mpi_hello.c
mpi_hello.sub
mpi_hello.sub.emyjobid
mpi_hello.sub.omyjobid
View results in the output file:
$ cat mpi_hello.sub.omyjobid
Runhost:radon-a010.rcac.purdue.edu Rank:0 of 16 ranks hello, world
Runhost:radon-a010.rcac.purdue.edu Rank:1 of 16 ranks hello, world
...
Runhost:radon-a010.rcac.purdue.edu Rank:7 of 16 ranks hello, world
Runhost:radon-a011.rcac.purdue.edu Rank:8 of 16 ranks hello, world
Runhost:radon-a011.rcac.purdue.edu Rank:9 of 16 ranks hello, world
...
Runhost:radon-a011.rcac.purdue.edu Rank:15 of 16 ranks hello, world
If the job failed to run, then view error messages in the file mpi_hello.sub.emyjobid.
If an MPI job uses a lot of memory and 8 MPI ranks per compute node overcommit the memory of the compute nodes, specify more compute nodes (MPI ranks) and fewer processor cores on each compute node, while keeping the total number of MPI ranks unchanged.
Submit the job to the default queue with double the number of compute nodes and half the number of processor cores and MPI ranks per compute node (the total number of MPI ranks remains unchanged):
$ qsub -l nodes=4:ppn=4,walltime=00:01:00 ./mpi_hello.sub
View results in the output file:
$ cat mpi_hello.sub.omyjobid
Runhost:radon-c010.rcac.purdue.edu Rank:0 of 16 ranks hello, world
Runhost:radon-c010.rcac.purdue.edu Rank:1 of 16 ranks hello, world
...
Runhost:radon-c010.rcac.purdue.edu Rank:3 of 16 ranks hello, world
Runhost:radon-c011.rcac.purdue.edu Rank:4 of 16 ranks hello, world
Runhost:radon-c011.rcac.purdue.edu Rank:5 of 16 ranks hello, world
...
Runhost:radon-c011.rcac.purdue.edu Rank:7 of 16 ranks hello, world
Runhost:radon-c012.rcac.purdue.edu Rank:8 of 16 ranks hello, world
Runhost:radon-c012.rcac.purdue.edu Rank:9 of 16 ranks hello, world
...
Runhost:radon-c012.rcac.purdue.edu Rank:11 of 16 ranks hello, world
Runhost:radon-c013.rcac.purdue.edu Rank:12 of 16 ranks hello, world
Runhost:radon-c013.rcac.purdue.edu Rank:13 of 16 ranks hello, world
...
Runhost:radon-c013.rcac.purdue.edu Rank:15 of 16 ranks hello, world
Because this example requests only 4 of the 8 processor cores on each compute node, other jobs may share those compute nodes. This sharing may still overcommit the memory.
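The trade-off between compute nodes and processor cores per node is simple arithmetic: the total number of MPI ranks stays fixed while the number of ranks per node shrinks. The following is a minimal sketch of that arithmetic, not a Radon or PBS tool; the function name nodes_for_ranks is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS): given a total number of MPI ranks
# and a number of ranks per compute node, print the matching resource request.
nodes_for_ranks() {
    total_ranks=$1
    ranks_per_node=$2
    if [ "$ranks_per_node" -le 0 ]; then
        echo "error: ranks per node must be positive" >&2
        return 1
    fi
    # Refuse layouts that do not divide evenly across compute nodes.
    if [ $((total_ranks % ranks_per_node)) -ne 0 ]; then
        echo "error: $total_ranks ranks do not divide evenly by $ranks_per_node" >&2
        return 1
    fi
    echo "nodes=$((total_ranks / ranks_per_node)):ppn=$ranks_per_node"
}

# The same 16 ranks, first at 8 per node, then at 4 per node.
nodes_for_ranks 16 8
nodes_for_ranks 16 4
```

The printed strings have the same form as the value given to the -l option of qsub in the examples above.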
To scatter 4 MPI ranks to 4 different compute nodes with each MPI rank having exclusive use of its compute node, apply the Linux command uniq to make a list of unique compute node names:
#!/bin/sh -l
# FILENAME: mpi_hello.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile
mpiexec -n 4 -machinefile nodefile ./mpi_hello
$ qsub -l nodes=4:ppn=8,walltime=00:01:00 ./mpi_hello.sub
Runhost: radon-a637.rcac.purdue.edu Rank: 0 of 4 ranks hello, world
Runhost: radon-a636.rcac.purdue.edu Rank: 1 of 4 ranks hello, world
Runhost: radon-a634.rcac.purdue.edu Rank: 2 of 4 ranks hello, world
Runhost: radon-a633.rcac.purdue.edu Rank: 3 of 4 ranks hello, world
To distribute 8 MPI ranks to 4 different compute nodes with pairs of MPI ranks having exclusive use of their compute nodes, modify the output of uniq with pairs of compute node names:
#!/bin/sh -l
# FILENAME: rankspernode
# For each unique compute node name read from standard input, output two copies.
while read -r LINE; do
    echo "$LINE"
    echo "$LINE"
done
#!/bin/sh -l
# FILENAME: mpi_hello.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE | ./rankspernode >nodefile
mpiexec -n 8 -machinefile nodefile ./mpi_hello
$ qsub -l nodes=4:ppn=8,walltime=00:01:00 ./mpi_hello.sub
Runhost: radon-a135.rcac.purdue.edu Rank: 0 of 8 ranks hello, world
Runhost: radon-a135.rcac.purdue.edu Rank: 1 of 8 ranks hello, world
Runhost: radon-a136.rcac.purdue.edu Rank: 2 of 8 ranks hello, world
Runhost: radon-a136.rcac.purdue.edu Rank: 3 of 8 ranks hello, world
Runhost: radon-a137.rcac.purdue.edu Rank: 4 of 8 ranks hello, world
Runhost: radon-a137.rcac.purdue.edu Rank: 5 of 8 ranks hello, world
Runhost: radon-a138.rcac.purdue.edu Rank: 6 of 8 ranks hello, world
Runhost: radon-a138.rcac.purdue.edu Rank: 7 of 8 ranks hello, world
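The rankspernode filter fixes the number of copies at two. As a sketch only, the same idea can take the number of ranks per compute node as an argument; the function name repeat_lines and the host names nodeA and nodeB are made up for illustration and are not part of the Radon examples.

```shell
#!/bin/sh
# Hypothetical generalization of rankspernode: repeat each input line
# COUNT times, where COUNT is the desired number of MPI ranks per node.
repeat_lines() {
    count=$1
    while read -r line; do
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '%s\n' "$line"
            i=$((i + 1))
        done
    done
}

# Stand-in for the output of "uniq <$PBS_NODEFILE": two made-up host names.
printf 'nodeA\nnodeB\n' | repeat_lines 3
```

Inside a job submission file, the pipeline would presumably read from uniq <$PBS_NODEFILE and write to nodefile, and the value passed to mpiexec -n would be the number of hosts multiplied by the count.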
Notes
For an introductory tutorial on how to write your own MPI programs:
A shared-memory job is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. It distributes the work of a process over several processor cores of a multi-core processor. Open Multi-Processing (OpenMP) is a specific implementation of the shared-memory model and is a collection of parallelization directives, library routines, and environment variables.
This section illustrates how to use PBS to submit to a batch session one of the OpenMP programs, either task parallelism or loop-level (data) parallelism, compiled in the section Compiling OpenMP Programs. There is no difference in running a Fortran, C, or C++ OpenMP program after compiling and linking it into an executable file.
The OpenMP runtime library automatically creates the optimal number of threads for execution in parallel on the multiple processor cores of a compute node. If you are running the program on a system with only one processor, you will not see any speedup. In fact, the program may run more slowly due to the overhead in the synchronization code generated by the compiler. For best performance, the number of threads should typically be equal to the number of processor cores you will be using.
When running OpenMP programs, all threads should be on the same compute node to take advantage of shared memory.
To run an OpenMP program, set the environment variable OMP_NUM_THREADS to the desired number of threads:
In csh:
$ setenv OMP_NUM_THREADS mynumberofthreads
In bash:
$ export OMP_NUM_THREADS=mynumberofthreads
You should also set the environment variable PARALLEL to 1. This variable must be set or else any timers used by the program will return incorrect timings (see the etime man page for more details).
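Instead of typing the thread count by hand, a job can derive it from the list of allocated processor cores. The sketch below assumes that the PBS node file holds one line per allocated core (as with a request such as nodes=1:ppn=8); the function name cores_on_host and the host name nodeA are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the processor cores allocated on one host by
# counting that host's lines in a node file. Assumes the node file holds
# one line per allocated core.
cores_on_host() {
    nodefile=$1
    host=$2
    # -x matches whole lines only, -F treats the host name as a fixed string.
    grep -c -x -F -- "$host" "$nodefile"
}

# Inside a job one would pass "$PBS_NODEFILE" and the name of the run host.
# Here a throwaway file with a made-up host name stands in for it.
demo=$(mktemp)
printf 'nodeA\nnodeA\nnodeA\nnodeA\n' >"$demo"
OMP_NUM_THREADS=$(cores_on_host "$demo" nodeA)
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
rm -f "$demo"
```

Deriving the value this way keeps the number of threads equal to the number of cores requested, even when the ppn value on the qsub command line changes.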
Suppose that you named your executable file omp_hello. Prepare a job submission file with an appropriate name, here named omp_hello.sub:
#!/bin/sh -l
# FILENAME: omp_hello.sub
module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=8
./omp_hello
Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the program.
Submit the OpenMP job to the default queue on Radon and request 1 complete compute node with all 8 processor cores (one per OpenMP thread) and 1 minute of wall time. You do not need to name the default queue explicitly. Job completion can take a while depending on the demand placed on the compute cluster.
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_hello.sub
View two new files in your directory (.o and .e):
$ ls -l
omp_hello
omp_hello.c
omp_hello.sub
omp_hello.sub.emyjobid
omp_hello.sub.omyjobid
View the results from one of the sample OpenMP programs about task parallelism:
$ cat omp_hello.sub.omyjobid
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 8 threads hello, world
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:1 of 8 threads hello, world
...
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:7 of 8 threads hello, world
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
If the job failed to run, then view error messages in the file omp_hello.sub.emyjobid.
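A quick way to tell whether the error file deserves a look is to test whether it is empty. This is a minimal sketch; the function name report_job_errors is hypothetical, and on Radon the argument would be a file such as omp_hello.sub.emyjobid.

```shell
#!/bin/sh
# Hypothetical helper: report whether a job's error file has any content.
# The -s test is true only for a file that exists and is not empty.
report_job_errors() {
    errfile=$1
    if [ -s "$errfile" ]; then
        echo "errors found in $errfile:"
        cat "$errfile"
        return 1
    fi
    echo "no error output in $errfile"
}

# A throwaway empty file stands in for a real error file.
demo=$(mktemp)
report_job_errors "$demo"
rm -f "$demo"
```

An empty error file does not prove that the program produced correct results; it only shows that nothing was written to standard error.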
If an OpenMP program uses a lot of memory and 8 threads overcommit the memory of the compute node, specify fewer processor cores (OpenMP threads) on that compute node.
Modify the job submission file omp_hello.sub to use half the number of processor cores:
#!/bin/sh -l
# FILENAME: omp_hello.sub
module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4
./omp_hello
Submit the job to the default queue with half the number of processor cores:
$ qsub -l nodes=1:ppn=4,walltime=00:01:00 omp_hello.sub
View the results from one of the sample OpenMP programs about task parallelism and using half the number of processor cores:
$ cat omp_hello.sub.omyjobid
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:1 of 4 threads hello, world
...
PARALLEL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
To retain exclusive use of a compute node while using fewer OpenMP threads than the number of processor cores physically available on that compute node:
#!/bin/sh -l
# FILENAME: omp_hello.sub
module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4
uniq <$PBS_NODEFILE >nodefile
./omp_hello
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_hello.sub
SERIAL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a639.rcac.purdue.edu Thread:0 of 1 thread hello, world
Practice submitting the sample OpenMP program about loop-level (data) parallelism:
#!/bin/sh -l
# FILENAME: omp_loop.sub
module load devel
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=8
./omp_loop
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 omp_loop.sub
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 8 threads Iteration:0 hello, world
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 8 threads Iteration:1 hello, world
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:1 of 8 threads Iteration:2 hello, world
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:1 of 8 threads Iteration:3 hello, world
...
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:7 of 8 threads Iteration:14 hello, world
PARALLEL LOOP: Runhost:radon-c044.rcac.purdue.edu Thread:7 of 8 threads Iteration:15 hello, world
SERIAL REGION: Runhost:radon-c044.rcac.purdue.edu Thread:0 of 1 thread hello, world
A hybrid job combines both message-passing and shared-memory attributes to take advantage of distributed-memory systems with multi-core processors. Work occurs across several compute nodes of a distributed-memory system and across the processor cores of the multi-core processors.
This section illustrates how to use PBS to submit to a batch session one of the hybrid programs compiled in the section Compiling Hybrid Programs. There is no difference in running a Fortran, C, or C++ hybrid program after compiling and linking it into an executable file.
The path to relevant MPI libraries is not set up on any run host by default. Using module load is the preferred way to access these libraries. Use module avail to see all software packages installed on Radon, including MPI library packages. Then, to employ one of the available MPI modules, enter the module load command.
The OpenMP runtime library automatically creates the optimal number of threads for execution in parallel on the multiple processor cores of a compute node. If you are running the program on a system with only one processor, you will not see any speedup. In fact, the program may run more slowly due to the overhead in the synchronization code generated by the compiler. For best performance, the number of threads should typically be equal to the number of processor cores you will be using.
When running hybrid programs, use all processor cores of the compute nodes to take advantage of shared memory.
To run a hybrid program, set the environment variable OMP_NUM_THREADS to the desired number of threads:
In csh:
$ setenv OMP_NUM_THREADS mynumberofthreads
In bash:
$ export OMP_NUM_THREADS=mynumberofthreads
You should also set the environment variable PARALLEL to 1. This variable must be set or else any timers used by the program will return incorrect timings (see the etime man page for more details).
Suppose that you named your executable file hybrid_hello. Prepare a job submission file with an appropriate filename, here named hybrid_hello.sub:
#!/bin/sh -l
# FILENAME: hybrid_hello.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile
export OMP_NUM_THREADS=8
mpiexec -n 2 -machinefile nodefile ./hybrid_hello
You can load any MPI library/compiler module that is available on Radon. This example uses the recommended library Open MPI.
Since PBS always sets the working directory to your home directory, you should either execute the cd $PBS_O_WORKDIR command, which will set the run-time current working directory to the directory from which you submitted the job submission file via the qsub command, or give the full path to the directory containing the executable program.
You invoke a hybrid program with the mpiexec command. The number of processes requested with mpiexec -n is usually equal to the number of MPI ranks of the application (more on this below).
Submit the hybrid job to the default queue on Radon and request 2 complete compute nodes, each running 1 MPI rank with all 8 processor cores (one per OpenMP thread), and 1 minute of wall time. You do not need to name the default queue explicitly. Job completion can take a while depending on the demand placed on the compute cluster.
$ qsub -l nodes=2:ppn=8,walltime=00:01:00 hybrid_hello.sub
179168.radon-adm.rcac.purdue.edu
View two new files in your directory (.o and .e):
$ ls -l
hybrid_hello
hybrid_hello.c
hybrid_hello.sub
hybrid_hello.sub.emyjobid
hybrid_hello.sub.omyjobid
View the results from one of the sample hybrid programs about task parallelism:
$ cat hybrid_hello.sub.omyjobid
SERIAL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 8 threads hello, world
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 8 threads hello, world
...
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 2 ranks, Thread:7 of 8 threads hello, world
SERIAL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 8 threads hello, world
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 8 threads hello, world
...
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 2 ranks, Thread:7 of 8 threads hello, world
SERIAL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
If the job failed to run, then view error messages in the file hybrid_hello.sub.emyjobid.
If a hybrid job uses a lot of memory and 8 OpenMP threads per compute node overcommit the memory of the compute nodes, specify more compute nodes (MPI ranks) and fewer processor cores (OpenMP threads) on each compute node.
Prepare a job submission file with double the number of compute nodes (MPI ranks) and half the number of processor cores (OpenMP threads):
#!/bin/sh -l
# FILENAME: hybrid_hello.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile
export OMP_NUM_THREADS=4
mpiexec -n 4 -machinefile nodefile ./hybrid_hello
Submit the job to the default queue on Radon with double the number of compute nodes (MPI ranks) and half the number of processor cores (OpenMP threads):
$ qsub -l nodes=4:ppn=4,walltime=00:01:00 hybrid_hello.sub
View the results from one of the sample hybrid programs about task parallelism with double the number of compute nodes (MPI ranks) and half the number of processor cores (OpenMP threads):
$ cat hybrid_hello.sub.omyjobid
SERIAL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 4 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 4 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 4 ranks, Thread:1 of 4 threads hello, world
...
PARALLEL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 4 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a020.rcac.purdue.edu Rank:0 of 4 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 4 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 4 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 4 ranks, Thread:1 of 4 threads hello, world
...
PARALLEL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 4 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a021.rcac.purdue.edu Rank:1 of 4 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a022.rcac.purdue.edu Rank:2 of 4 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a022.rcac.purdue.edu Rank:2 of 4 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a022.rcac.purdue.edu Rank:2 of 4 ranks, Thread:1 of 4 threads hello, world
...
PARALLEL REGION: Runhost:radon-a022.rcac.purdue.edu Rank:2 of 4 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a022.rcac.purdue.edu Rank:2 of 4 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a023.rcac.purdue.edu Rank:3 of 4 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a023.rcac.purdue.edu Rank:3 of 4 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a023.rcac.purdue.edu Rank:3 of 4 ranks, Thread:1 of 4 threads hello, world
...
PARALLEL REGION: Runhost:radon-a023.rcac.purdue.edu Rank:3 of 4 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a023.rcac.purdue.edu Rank:3 of 4 ranks, Thread:0 of 1 thread hello, world
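The hybrid job submission files hard-code the value given to mpiexec -n, which must agree with the number of distinct compute nodes in the machine file. As a sketch only, that value can be derived from the node list instead. The function name unique_hosts and the host names are hypothetical, and sort -u is swapped in for uniq here so that duplicate names are removed even when they are not adjacent, at the cost of reordering the hosts.

```shell
#!/bin/sh
# Hypothetical helper: write one line per distinct host to OUTFILE and
# print how many distinct hosts there are (one MPI rank per host).
unique_hosts() {
    infile=$1
    outfile=$2
    sort -u "$infile" >"$outfile"
    wc -l <"$outfile" | tr -d ' '
}

# Stand-in for $PBS_NODEFILE: 4 cores on each of 2 made-up hosts.
demo_in=$(mktemp)
demo_out=$(mktemp)
printf 'nodeA\nnodeA\nnodeA\nnodeA\nnodeB\nnodeB\nnodeB\nnodeB\n' >"$demo_in"
ranks=$(unique_hosts "$demo_in" "$demo_out")
echo "would run: mpiexec -n $ranks -machinefile nodefile ./hybrid_hello"
rm -f "$demo_in" "$demo_out"
```

With the count taken from the node list, changing the nodes value on the qsub command line would not require editing the job submission file.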
To retain exclusive use of compute nodes while using fewer OpenMP threads than the number of processor cores physically available on each compute node:
#!/bin/sh -l
# FILENAME: hybrid_hello.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile
export OMP_NUM_THREADS=4
mpiexec -n 2 -machinefile nodefile ./hybrid_hello
$ qsub -l nodes=2:ppn=8,walltime=00:01:00 hybrid_hello.sub
SERIAL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a637.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a634.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
Practice submitting the sample hybrid program about loop-level (data) parallelism:
#!/bin/sh -l
# FILENAME: hybrid_loop.sub
module load devel
cd $PBS_O_WORKDIR
uniq <$PBS_NODEFILE >nodefile
export OMP_NUM_THREADS=8
mpiexec -n 2 -machinefile nodefile ./hybrid_loop
$ qsub -l nodes=2:ppn=8,walltime=00:01:00 hybrid_loop.sub
SERIAL REGION: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 8 threads Iteration:0 hello, world
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 8 threads Iteration:1 hello, world
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 8 threads Iteration:2 hello, world
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 8 threads Iteration:3 hello, world
...
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:7 of 8 threads Iteration:14 hello, world
PARALLEL LOOP: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:7 of 8 threads Iteration:15 hello, world
SERIAL REGION: Runhost:radon-a044.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 8 threads Iteration:0 hello, world
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 8 threads Iteration:1 hello, world
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 8 threads Iteration:2 hello, world
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 8 threads Iteration:3 hello, world
...
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:7 of 8 threads Iteration:14 hello, world
PARALLEL LOOP: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:7 of 8 threads Iteration:15 hello, world
SERIAL REGION: Runhost:radon-a045.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
Some applications process data stored in a large input data file. The size of this file may be so large that it cannot fit within the quota of a home directory. This file might reside on Fortress or some other external storage medium. The way to process this file on Radon is to copy it to your scratch directory where a job running on a compute node of Radon may access it.
This section illustrates how to submit a small job which reads a data file which resides on the scratch file system. This example, myprogram.c, displays the name of the compute node which runs the job, the path name of the current working directory, the contents of that directory, and copies the contents of an input scratch file to an output scratch file. Linux commands access system information. To compile this program, see Compiling Serial Programs.
Prepare a scratch file directory with a large input data file:
$ ls -l $RCAC_SCRATCH
total 96
-rw-r----- 1 myusername itap 27 Jun 8 10:41 mybiginputdatafile
Prepare a job submission file with the path to your scratch file directory listed as a command-line argument and with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
module load devel
cd $PBS_O_WORKDIR
./myprogram $RCAC_SCRATCH
Submit this job to the default queue on Radon and request 1 processor core of 1 compute node and 1 minute of wall time. Requesting the default queue does not require explicitly asking for it.
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View two new files in the home directory (.o and .e):
$ ls -l
total 160
-rw-r--r-- 1 myusername itap 54 Jun 8 10:29 README
-rw-r--r-- 1 myusername itap 136 Jun 8 11:04 myjob.sub
-rw------- 1 myusername itap 0 Jun 8 11:05 myjob.sub.e266283
-rw------- 1 myusername itap 780 Jun 8 11:05 myjob.sub.o266283
-rwxr-xr-x 1 myusername itap 9526 Jun 8 11:04 myprogram*
-rw-r--r-- 1 myusername itap 3930 Jun 8 11:13 myprogram.c
View one new file in the scratch file directory, mybigoutputdatafile:
$ ls -l $RCAC_SCRATCH
total 96
-rw-r----- 1 myusername itap 27 Jun 8 10:41 mybiginputdatafile
-rw-r--r-- 1 myusername itap 42 Jun 8 11:05 mybigoutputdatafile
View results in the output file:
$ cat myjob.sub.o266283
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
radon-d036.rcac.purdue.edu
/home/myusername
total 128
-rw-r--r-- 1 myusername itap 54 Jun 8 10:29 README
-rw-r--r-- 1 myusername itap 136 Jun 8 11:04 myjob.sub
-rwxr-xr-x 1 myusername itap 9526 Jun 8 11:04 myprogram
-rw-r--r-- 1 myusername itap 3976 Jun 8 10:45 myprogram.c
total 128
-rw-r--r-- 1 myusername itap 54 Jun 8 10:29 README
-rw-r--r-- 1 myusername itap 136 Jun 8 11:04 myjob.sub
-rwxr-xr-x 1 myusername itap 9526 Jun 8 11:04 myprogram
-rw-r--r-- 1 myusername itap 3976 Jun 8 10:45 myprogram.c
*** MAIN START ***
input scratch file: /scratch/scratch95/m/myusername/mybiginputdatafile
output scratch file: /scratch/scratch95/m/myusername/mybigoutputdatafile
scratch file system: textfromscratchfile
*** MAIN STOP ***
The output shows the name of the compute node which PBS chose to run the job, the path of the current working directory (the user's home directory), before-and-after listings of the content of the current working directory, and output from the application. The output scratch file named mybigoutputdatafile, the primary output of this program, appears in the scratch directory, not the home directory.
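Before submitting a job of this kind, the input file has to be present in the scratch directory. The following is a minimal sketch of that staging step; the function name stage_input is hypothetical, and temporary directories stand in for the external storage and for $RCAC_SCRATCH so that the sketch runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: copy an input file into a scratch directory and
# confirm that the copy exists and is not empty.
stage_input() {
    src=$1
    scratch_dir=$2
    cp "$src" "$scratch_dir/" || return 1
    [ -s "$scratch_dir/$(basename "$src")" ]
}

# Throwaway directories stand in for external storage and for $RCAC_SCRATCH.
demo_src=$(mktemp -d)
demo_scratch=$(mktemp -d)
echo "textfromscratchfile" >"$demo_src/mybiginputdatafile"
if stage_input "$demo_src/mybiginputdatafile" "$demo_scratch"; then
    echo "input staged; the job can now be submitted"
fi
rm -rf "$demo_src" "$demo_scratch"
```

On Radon the second argument would presumably be "$RCAC_SCRATCH", the same directory that the job submission file passes to the program.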
Some applications write a large amount of intermediate data to a temporary file during an early part of the process then read that data for further processing during a later part of the process. The size of this file may be so large that it cannot fit within the quota of a home directory or that it requires too much I/O activity between the compute node and either the home directory or the scratch file directory. The way to process this intermediate file on Radon is to use the /tmp directory of the compute node which runs the job. Used properly, /tmp may provide faster local storage to an active process than any other storage option.
This section illustrates how to submit a small job which first writes then reads an intermediate data file which resides on the /tmp directory. This example, myprogram.c, displays the contents of the /tmp directory before and after processing. Linux commands access system information. To compile this program, see Compiling Serial Programs.
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
module load devel
cd $PBS_O_WORKDIR
./myprogram
Submit this job to the default queue on Radon and request 1 processor core of 1 compute node and 1 minute of wall time. Requesting the default queue does not require explicitly asking for it:
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View results in the output file, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
-rw-r--r-- 1 myusername itap 12 Jun 16 11:36 /tmp/mytmpfile
*** MAIN START ***
/tmp file data: abcdefghijk
*** MAIN STOP ***
The output verifies the existence of the intermediate data file in the /tmp directory.
View results in the error file, myjob.sub.emyjobid:
ls: /tmp/mytmpfile: No such file or directory
The results in the error file verify that the intermediate data file does not exist at the start of processing.
While the /tmp directory can provide faster local storage to an active process than other storage options, you never know how much storage is available in the /tmp directory of the compute node chosen to run your job. If an intermediate data file consistently fails to fit in the /tmp directories of a set of compute nodes, consider limiting the pool of candidate compute nodes to those which can handle your intermediate data file.
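Because the free space in /tmp is unknown in advance, a job can check it before writing its intermediate file. This is a sketch under stated assumptions: the function name free_kb and the 1 KB threshold are hypothetical, and a real job would substitute the expected size of its own intermediate file.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in kilobytes, of the file
# system that holds DIR, using the portable (-P) output format of df.
free_kb() {
    # Line 1 of the output is the header; field 4 of line 2 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical threshold of 1 KB, chosen only so the sketch runs anywhere.
needed_kb=1
if [ "$(free_kb /tmp)" -ge "$needed_kb" ]; then
    echo "/tmp has room for the intermediate file"
else
    echo "/tmp lacks room for the intermediate file" >&2
fi
```

The check reflects the state of /tmp only at the moment it runs; other jobs on the same compute node may consume the space afterwards.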
Several commercial and third-party software packages are available on Radon and accessible through PBS.
We try to continually test the examples in the next few sections, but you may find some differences. If you need assistance, please contact us.
With the exception of Octave and R, which are free software, only Purdue affiliates may use the following licensed software.
Gaussian is a computational chemistry software package which works on electronic structure. This section illustrates how to submit a small Gaussian job to a PBS queue. This Gaussian example runs the Fletcher-Powell multivariable optimization.
Prepare a Gaussian input file with an appropriate filename, here named myjob.com. The final blank line is necessary:
#P TEST OPT=FP STO-3G OPTCYC=2

STO-3G FLETCHER-POWELL OPTIMIZATION OF WATER

0 1
O
H 1 R
H 1 R 2 A

R 0.96
A 104.

To submit this job, load Gaussian then run the provided script, named subg09. This job uses one compute node with 8 processor cores:
$ module load gaussian09/B.01
$ subg09 myjob -l nodes=1:ppn=8
View job status:
$ qstat -u myusername
View results in the file for Gaussian output, here named myjob.log. Only the first and last few lines appear here:
Entering Gaussian System, Link 0=/apps/rhel5/g09-B.01/g09/g09
Initial command:
/apps/rhel5/g09-B.01/g09/l1.exe /scratch/scratch95/m/myusername/gaussian/Gau-7781.inp -scrdir=/scratch/scratch95/m/myusername/gaussian/
Entering Link 1 = /apps/rhel5/g09-B.01/g09/l1.exe PID= 7782.
Copyright (c) 1988,1990,1992,1993,1995,1998,2003,2009,2010,
Gaussian, Inc. All Rights Reserved.
.
.
.
Job cpu time: 0 days 0 hours 1 minutes 37.3 seconds.
File lengths (MBytes): RWF= 5 Int= 0 D2E= 0 Chk= 1 Scr= 1
Normal termination of Gaussian 09 at Wed Mar 30 10:49:02 2011.
real 17.11
user 92.40
sys 4.97
Machine:
radon-a389
radon-a389
radon-a389
radon-a389
radon-a389
radon-a389
radon-a389
radon-a389
The ppn= specification should be used as in the following. It does not affect the way the job runs, but it makes the #tasks entry in the qstat output appear correctly.
Submit job using 4 processor cores on a single node:
$ subg09 myjob -l nodes=1:ppn=4,walltime=200:00:00 -q myqueuename
Submit job using 4 processor cores on each of 2 nodes:
$ subg09 myjob -l nodes=2:ppn=4,walltime=200:00:00 -q myqueuename
Submit job using 8 processor cores on a single node:
$ subg09 myjob -l nodes=1:ppn=8,walltime=200:00:00 -q myqueuename
Submit job using 8 processor cores on each of 2 nodes:
$ subg09 myjob -l nodes=2:ppn=8,walltime=200:00:00 -q myqueuename
For more information about Gaussian:
Maple is a general-purpose computer algebra system. This section illustrates how to submit a small Maple job to a PBS queue. This Maple example differentiates, integrates, and finds the roots of polynomials.
Prepare a Maple input file with an appropriate filename, here named myjob.in:
# FILENAME: myjob.in
# Differentiate wrt x.
diff( 2*x^3,x );
# Integrate wrt x.
int( 3*x^2*sin(x)+x,x );
# Solve for x.
solve( 3*x^2+2*x-1,x );
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
module load maple
cd $PBS_O_WORKDIR
# Use the -q option to suppress startup messages.
# maple -q myjob.in
maple myjob.in
OR:
#!/bin/sh -l
# FILENAME: myjob.sub
module load maple
# Use the -q option to suppress startup messages.
# maple -q << EOF
maple << EOF
# Differentiate wrt x.
diff( 2*x^3,x );
# Integrate wrt x.
int( 3*x^2*sin(x)+x,x );
# Solve for x.
solve( 3*x^2+2*x-1,x );
EOF
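A here-document feeds the lines that follow a command to its standard input, up to a line that contains only the delimiter word. The sketch below illustrates the mechanism with cat standing in for an interpreter such as maple, so it runs without any licensed software; the function name run_inline is hypothetical. Quoting the delimiter ('EOF') is an optional variation that stops the shell from expanding $ and backquotes inside the body.

```shell
#!/bin/sh
# "cat" stands in for an interpreter such as maple; it simply echoes the
# lines that the here-document feeds to its standard input.
run_inline() {
    cat << 'EOF'
diff( 2*x^3,x );
solve( 3*x^2+2*x-1,x );
EOF
}

# The closing EOF must be alone on its line, with no leading spaces.
run_inline
```

Keeping the input inline means the job needs only one file, while a separate input file is easier to reuse across jobs.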
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, here named myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
                                        2
                                     6 x

                                                       2
                  2                                   x
              -3 x  cos(x) + 6 cos(x) + 6 x sin(x) + ----
                                                      2

                                    1/3, -1
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about Maple:
Mathematica implements numeric and symbolic mathematics. This section illustrates how to submit a small Mathematica job to a PBS queue. This Mathematica example finds the three roots of a third-degree polynomial.
Prepare a Mathematica input file with an appropriate filename, here named myjob.in:
(* FILENAME: myjob.in *)
(* Find roots of a polynomial. *)
p=x^3+3*x^2+3*x+1
Solve[p==0]
Quit
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
module load mathematica
cd $PBS_O_WORKDIR
math < myjob.in
OR:
#!/bin/sh -l
# FILENAME: myjob.sub
module load mathematica
math << EOF
(* Find roots of a polynomial. *)
p=x^3+3*x^2+3*x+1
Solve[p==0]
Quit
EOF
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, here named myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Mathematica 5.2 for Linux x86 (64 bit)
Copyright 1988-2005 Wolfram Research, Inc.
-- Terminal graphics initialized --
In[1]:=
In[2]:=
In[2]:=
In[3]:=
                     2    3
Out[3]= 1 + 3 x + 3 x  + x
In[4]:=
Out[4]= {{x -> -1}, {x -> -1}, {x -> -1}}
In[5]:=
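Solve lists the same root three times because the polynomial is a perfect cube. The factorization below is a hand check, not part of the Mathematica output:

```latex
\[
p(x) = x^{3} + 3x^{2} + 3x + 1 = (x + 1)^{3}
\quad\Longrightarrow\quad
x = -1 \ \text{(a root of multiplicity 3)}
\]
```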
View the standard error file, myjob.sub.emyjobid:
rmdir: ./ligo/rengel/tasks: Directory not empty
rmdir: ./ligo/rengel: Directory not empty
rmdir: ./ligo: Directory not empty
For more information about Mathematica:
MATLAB® (an acronym for MATrix LABoratory) is a general-purpose, high-level programming package which offers a fourth-generation programming language that enables computationally intensive tasks. It integrates a powerful programming language with computation and visualization to provide a flexible environment where problems and solutions appear in familiar mathematical notation. MATLAB allows the integration of external routines written in C, C++, Fortran, and Java with MATLAB applications. Built-in interfaces handle the importing of data from instruments, files, and external databases. MATLAB is a product of The MathWorks, a privately held company founded in 1984.
The MATLAB interpreter is the part of MATLAB which reads M-files and MEX-files and executes MATLAB statements. Simulink® is a graphical environment for simulation and Model-Based Design of multidomain dynamic and embedded systems. The Parallel Computing Toolbox (PCT) parallelizes MATLAB applications. The Distributed Computing Server (DCS) scales up PCT applications to compute clusters. Many other optional, add-on toolboxes (separately available collections of special-purpose MATLAB functions) extend the basic MATLAB package to solve particular classes of problems; they focus on individual areas of science and industry. The MATLAB Compiler™ (mcc) compiles a MATLAB application into a standalone program or software component. The term MATLAB can mean just the interpreter or the entire package.
Industries using MathWorks products include automobile, aerospace, communications, electronics, finance, industrial automation, and medicine. Areas of application include linear algebra and other calculations involving matrices or vectors of data, mathematics, statistics, signal processing, image processing, communications, control design, test and measurement, financial modeling and analysis, computational biology, algorithm development, simulation, data acquisition and analysis and visualization, and numeric and symbolic computation.
Purdue University has a system-wide license agreement with The MathWorks to use MATLAB for the purposes of teaching and research. Purdue's license number, your Purdue email address, and your MATLAB password provide access to MathWorks help desk, webinars, and other materials. matlabroot provides the path to the location where MATLAB is installed including the path to examples. To discover Purdue's license number, version details, and the path to examples:
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> license
819994
>> ver
>> disp(matlabroot)
/apps/rhel5/MATLAB/R2011b
>> quit;
$
MATLAB, Simulink, Compiler, and several of the optional toolboxes are available to faculty, staff, and students. To see the kind and quantity of all MATLAB licenses plus the number that you are currently using:
$ matlab_licenses
Licenses
MATLAB Product / Toolbox Name myusername Free Total
================================== ============================
Aerospace Blockset 0 10 10
Aerospace Toolbox 0 18 20
Bioinformatics Toolbox 0 19 20
Communication Toolbox 0 27 30
Compiler 0 14 15
Control Toolbox 0 60 75
Curve Fitting Toolbox 0 37 75
Data Acq Toolbox 0 10 10
Database Toolbox 0 5 5
Datafeed Toolbox 0 5 5
Dial and Gauge Blocks 0 14 25
Econometrics Toolbox 0 13 15
Excel Link 0 5 5
Financial Toolbox 0 13 15
Fixed-Point Blocks 0 5 5
Fixed Point Toolbox 0 11 20
Fuzzy Toolbox 0 9 10
GADS Toolbox 0 13 15
Identification Toolbox 0 15 15
Image Acquisition Toolbox 0 5 5
Image Toolbox 0 61 100
Instr Control Toolbox 0 10 15
MAP Toolbox 0 25 30
MATLAB 1 373 1,000
MATLAB Builder for dot Net 0 1 1
MATLAB Coder 0 25 25
MATLAB Distrib Comp Server 5 12 32
MATLAB Report Gen 0 2 2
MBC Toolbox 0 5 5
MPC Toolbox 0 4 5
Neural Network Toolbox 0 11 15
OPC Toolbox 0 1 1
Optimization Toolbox 0 91 125
Parallel Computing Toolbox 1 31 50
PDE Toolbox 0 13 15
Power System Blocks 0 21 30
Real-Time Win Target 0 8 15
Real-Time Workshop 0 4 25
Robust Toolbox 0 5 5
RTW Embedded Coder 0 15 15
Signal Blocks 0 28 30
Signal Toolbox 0 53 100
SimBiology 0 4 5
SimHydraulics 0 15 15
SimMechanics 0 4 5
Simscape 0 29 30
SIMULINK 0 65 100
Simulink Control Design 0 15 15
Simulink Design Optim 0 4 5
SIMULINK Report Gen 0 2 2
SL Verification Validation 0 4 5
Stateflow 0 13 15
Statistics Toolbox 0 31 100
Symbolic Toolbox 0 56 75
Virtual Reality Toolbox 0 5 5
Wavelet Toolbox 0 14 15
XPC Target 0 9 20
The table shows the kind and quantity of MATLAB licenses which Purdue owns. The second column lists the number of licenses that you are currently using. The third column is a snapshot of the number of licenses currently available. The fourth column shows the total number of licenses which Purdue owns for each product. The table illustrates that while there are many MATLAB licenses, access to toolboxes is limited. Since Purdue's community of MATLAB users shares these licenses, users should plan an effective strategy so as not to prevent others from gaining access to MATLAB resources.
To limit the table to the MATLAB licenses that your running jobs currently hold:
$ matlab_licenses -u
Licenses
MATLAB Product / Toolbox Name myusername Free Total
================================== ===========================
MATLAB 1 373 1,000
MATLAB Distrib Comp Server 5 12 32
Parallel Computing Toolbox 1 31 50
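When planning around scarce toolboxes, it can help to filter the license table for products with few free licenses. The following is a minimal sketch, not an ITaP-provided tool: the function name low_free_licenses is hypothetical, and it assumes the column layout shown above (product name, your usage, free, total), possibly with thousands separators in the counts.

```shell
# Hypothetical helper: list products whose "Free" count is at or below a threshold.
# Reads matlab_licenses-style rows on standard input:
#   <product name ...> <your usage> <free> <total>
# $1 = threshold for the number of free licenses.
low_free_licenses() {
    awk -v max="$1" '
        # Keep only data rows: at least four fields, numeric Free and Total.
        NF >= 4 && $(NF-1) ~ /^[0-9,]+$/ && $NF ~ /^[0-9,]+$/ {
            free = $(NF-1)
            gsub(/,/, "", free)              # drop thousands separators
            if (free + 0 <= max + 0) {
                name = $1                    # product names may contain spaces
                for (i = 2; i <= NF - 3; i++) name = name " " $i
                print name ": " free " free"
            }
        }'
}
```

A possible use is `matlab_licenses | low_free_licenses 5`, which would print one line for each product with five or fewer free licenses. Header and separator lines are skipped because their last two fields are not numeric.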
MathWorks expects their customers to use MATLAB interactively on a laptop or desktop. Purdue purchased licenses to run MATLAB on Linux clusters, a world of batch processing. The batch world of Linux clusters is very different from the interactive world of laptops. This difference requires another approach when applying MATLAB to large and compute-intensive applications, since you share resources on the clusters. Consider the analogy of the book. You can buy your personal copy of a book, or you can use a copy from a library. You can buy the compute cycles of a laptop, or you can use the compute cycles of a cluster. You can buy a MATLAB license, or you can use a MATLAB license on one of Purdue's community clusters.
In this shared environment, you must act like a "nice" user. There are two opportunities to be a "nice" user.
First, consider where you run your MATLAB client. Yes, you can log on a front end of a cluster, use the module feature to load a MATLAB client, and interact with your MATLAB client via the command line prompt much like those who run MATLAB on a laptop. Purdue allows application development on the front end of a cluster. Once you finish development and you are ready to move your application to production, Purdue asks that you run your MATLAB application on the compute nodes of a cluster. This means either running your MATLAB client on a front end and using the MATLAB functions batch() or submit() or running your MATLAB client on a compute node. The latter method involves PBS and its qsub command to send a script to a compute node which runs MATLAB. It also involves making available any related M-files and data files to the compute nodes chosen to run your job. These methods avoid tying up the front end and preventing other users from accomplishing their development. So, this is one way to be a "nice" user.
The second opportunity to be a "nice" user is to consider how many MATLAB licenses your job requires and for how long. When you run a MATLAB client, you are using one MATLAB license. When you run a MATLAB parfor loop or a MATLAB spmd statement in a MATLAB pool job, you are using at least one additional license, which comes from the Parallel Computing Toolbox. Running this job in the local configuration requires no additional license. If you wish to use a scheduler like PBS to submit a MATLAB pool job to a compute node of a cluster, then you use yet more licenses, which come from the MATLAB Distributed Computing Server. Running a MATLAB pool job with four MATLAB workers (labs) requires seven licenses: one MATLAB, one PCT, and five DCS licenses (one worker oversees the job in addition to the four workers of the pool). At some point after development and before production, you should consider ways to reduce how many licenses your jobs use, for example by using the local configuration or by compiling your application. MathWorks allows linking MATLAB libraries to your compiled applications so that your jobs may run without using any MATLAB license. Also, MathWorks permits distributing standalone executables and software components royalty-free.
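The license arithmetic for a pool job can be sketched as a small shell function. This is an illustration only: the function name pool_job_licenses is hypothetical, and the rule "DCS licenses = pool workers + 1" is an assumption generalized from the single four-worker example above.

```shell
# Hypothetical helper: estimate the licenses checked out by a MATLAB pool job.
# $1 = number of pool workers (labs)
# $2 = "local" (local configuration) or "pbs" (submitted through PBS); default pbs
pool_job_licenses() {
    workers=$1
    mode=${2:-pbs}
    matlab=1                              # one license for the MATLAB client
    pct=1                                 # one Parallel Computing Toolbox license
    case $mode in
        local) dcs=0 ;;                   # local configuration: no DCS licenses
        pbs)   dcs=$((workers + 1)) ;;    # assumption: pool workers plus one more
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    echo "MATLAB=$matlab PCT=$pct DCS=$dcs total=$((matlab + pct + dcs))"
}

pool_job_licenses 4 pbs      # prints: MATLAB=1 PCT=1 DCS=5 total=7
pool_job_licenses 4 local    # prints: MATLAB=1 PCT=1 DCS=0 total=2
```

The second call shows why the local configuration is the cheaper choice during development: the same pool needs two licenses instead of seven.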
MATLAB distinguishes three types of jobs (and three corresponding constructors): distributed (createJob()), pool (createMatlabPoolJob()), and parallel (createParallelJob()). A distributed job is one or more independent, single-processor-core tasks of MATLAB statements. Tasks may be identical or different; however, they do not interact with each other, and they need not run simultaneously. Tasks are distributed to workers as the workers become available, so a worker might process one or more tasks in succession. A serial job is just a distributed job with a single task which one processor core executes once. Typically, distributed jobs run parameter sweeps (running the same code with different inputs). A distributed job is also known as a task-parallel or embarrassingly parallel job.
A pool job involves code that requires one of the workers to distribute work to the other workers. One worker oversees the work accomplished by the other workers. The parfor and spmd statements of a pool job are similar to the parallel loop and parallel region, respectively, of the OpenMP Standard. Typically, a pool job implements a for loop whose iterations are independent, many, and long running. A pool job can also implement codistributed arrays as a means of handling data arrays which are too large to fit into the memory of any one compute node.
A parallel job is a single task running concurrently (in parallel) on two or more processor cores. The copies of the task are not independent; they may interact with each other. It is similar to a program running the Message-Passing Interface (MPI Standard). A parallel job is also known as a data-parallel job.
A MATLAB program may call user code written in C, C++, or Fortran (MEX file). The reverse is also true: a user program written in C, C++, or Fortran may call MATLAB functions or user-defined functions written in the MATLAB language (standalone program). MATLAB also offers a compiler which allows you to share your MATLAB applications as an executable or a shared library with end users outside the MATLAB environment.
A few core concepts of MATLAB organize your strategy when developing and submitting MATLAB jobs to a Linux compute cluster. The term MATLAB client simply refers to a running copy of MATLAB. A client may run on a front end or on a compute node. The location of a client is important since it can affect the kind and quantity of MATLAB licenses needed to run a job.
MATLAB has two kinds of schedulers: the 'local' scheduler and an installation-specific scheduler. In Purdue's case, the latter is named 'torque'. The 'local' scheduler runs a MATLAB job on the processor core(s) of the same node that is running the client (either a front end or a compute node of the cluster). Development work may occur on a front end. The 'torque' scheduler runs a MATLAB job on compute node(s) different from the node running the client. When using either scheduler, you typically use the submit() function and a sequence of related functions to set up the details of your job submission. When using the 'torque' scheduler, you may specify options that usually appear on the PBS qsub command. Production work should occur on compute node(s), not on the front end.
MATLAB offers two kinds of configurations: the 'local' configuration and user-defined configurations. The 'local' configuration runs a MATLAB job on the processor core(s) of the same node that is running the client (either a front end or a compute node of the cluster). Development work may occur on a front end. To run a MATLAB job on compute node(s) different from the node running the client, you must define your own configurations with the Configuration Manager. You find the Configuration Manager in the Parallel menu; MATLAB offers no set of functions equivalent to the Configuration Manager. When using either configuration, you typically use the batch() function to set up the details of your job submission. When using your own configuration, you may specify options that usually appear on the PBS qsub command. If your application runs best with a different compute node topology when you provide different initial conditions, you may submit that job with more than one user-defined configuration. Production work should occur on compute node(s), not on the front end.
Once your project is ready for production, your strategy becomes either compiling the MATLAB code into an executable file or using the Coder to generate standalone C and C++ code from MATLAB code. The generated source code is portable and readable. The MATLAB Compiler license is a lingering license. Using the compiler locks its license for at least 30 minutes. Each time you issue the compiler command, you reset the 30-minute timer. When running mcc at the MATLAB prompt, you hold the license as long as MATLAB remains open. To release the license, quit MATLAB. Quitting MATLAB is necessary to release the license, but not sufficient. When you quit MATLAB before the 30-minute timer runs out, the license remains checked out for the remaining time. When running mcc at the Linux prompt, the 30-minute timer starts or restarts. The license manager tracks MATLAB licenses per user per cluster. If you run the MATLAB Compiler on two different clusters, you use two Compiler licenses. To minimize your license usage, run the Compiler on one cluster.
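The lingering behavior of the Compiler license amounts to taking the later of two times: the moment MATLAB quits and the moment the timer started by the last mcc command runs out. The sketch below illustrates that arithmetic in minutes. The function name compiler_license_release is hypothetical, and the sketch assumes a lingering period of exactly 30 minutes, whereas the text above only guarantees at least 30.

```shell
# Hypothetical helper: minute at which a lingering Compiler license is released.
# $1 = minute at which the last mcc command was issued
# $2 = minute at which MATLAB was quit
# For mcc run at the Linux prompt, pass the same minute for both arguments.
compiler_license_release() {
    last_mcc=$1
    quit=$2
    linger=30                              # assumed lingering period, in minutes
    expiry=$((last_mcc + linger))          # each mcc command restarts the timer
    if [ "$quit" -gt "$expiry" ]; then
        echo "$quit"                       # license held until MATLAB exits
    else
        echo "$expiry"                     # quitting early does not release it
    fi
}

compiler_license_release 10 25    # prints 40: timer outlasts the MATLAB session
compiler_license_release 10 60    # prints 60: session outlasts the timer
```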
The following sections provide several examples illustrating how to submit MATLAB jobs to a Linux compute cluster. They also explain the kind and quantity of MATLAB licenses for each method. When developing your application, use a method that submits your MATLAB job to a compute node while using MATLAB's 'local' configuration. This avoids competing for the limited number of DCS licenses. When running your application in production mode, use the Compiler or Coder, use compute nodes, not the front end, and use the minimal number of MATLAB licenses possible.
Finally, MATLAB offers implicit parallelism in the form of thread-parallel enabled functions. This is different from the explicit parallelism of the Parallel Computing Toolbox. When you know that you are developing a serial job and you are unsure whether you are calling one of MATLAB's thread-parallel enabled functions, run MATLAB with implicit parallelism turned off: -singleCompThread. When you know that you want to use a thread-parallel enabled function for its parallelism, request exclusive use of a node by setting ppn= to the number of processor cores physically available on the compute node of a cluster.
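The two cases can be summarized as a dry-run sketch that prints, but does not run, the submission and MATLAB commands. The function name matlab_job_commands and the file names myjob.sub and myscript are placeholders chosen for illustration; the default of 8 cores matches the Radon hardware description.

```shell
# Hypothetical helper: print (not run) the commands for a serial or threaded job.
# $1 = "serial" (implicit parallelism off) or "threaded" (whole node requested)
# $2 = processor cores per compute node (default 8, as on Radon)
matlab_job_commands() {
    mode=$1
    cores=${2:-8}
    case $mode in
        serial)
            ppn=1
            flags="-nodisplay -singleCompThread" ;;   # one computational thread
        threaded)
            ppn=$cores                                # request every core of the node
            flags="-nodisplay" ;;                     # leave implicit parallelism on
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
    echo "qsub -l nodes=1:ppn=$ppn myjob.sub"         # submission command
    echo "matlab $flags -r myscript"                  # command inside myjob.sub
}

matlab_job_commands serial
matlab_job_commands threaded
```

Printing the commands instead of executing them keeps the sketch safe to try on a front end.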
For more information about MATLAB:
Use the Configuration Manager in the Parallel menu to prepare your PBS configuration. This configuration contains the PBS details (queue, nodes, ppn, walltime, etc.) of your job submission. Ultimately, your PBS configuration will be an argument of the MATLAB functions batch() or findResource(). Alternatively, you can make your PBS configuration the default configuration which function batch() reads during job submission. If you have several applications and each requires different PBS command-line options, then each application may have its own configuration. To make your PBS configuration, load a MATLAB module on a front end and verify the version of the MATLAB module loaded. Run a MATLAB client with the desktop showing. First, discover the current list of configurations; most likely, just 'local'. Then select the Parallel menu:
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -singleCompThread
>> [current all] = defaultParallelConfig
>> disp(current)
local
>> disp(all)
'local'
>>
Parallel
Manage Configurations
In the Configurations Manager dialog box, select File:
File
New
torque
Enter properties as needed. Here are a few suggestions.
Enter a Configuration name:
mypbsconfig
In the Jobs tab, enter an appropriate value for the minimum and maximum number of workers.
ClusterMatlabRoot (use the path of the chosen version of MATLAB):
/apps/rhel5/MATLAB_R2010a
/apps/rhel5/MATLAB/R2010b
/apps/rhel5/MATLAB/R2011b
ClusterSize (the number of DCS licenses available):
128
ResourceTemplate (PBS command-line options):
-l nodes=^N^
SubmitArguments (PBS command-line options):
-q myqueuename -l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+^N^
ClusterOsType:
unix
HasSharedFileSystem:
True
RshCommand:
ssh
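The ResourceTemplate value above contains the placeholder ^N^. The sketch below assumes that MATLAB replaces each ^N^ with the number of workers requested for the job before passing the options to PBS; the function expand_template is hypothetical and only imitates that substitution so that you can preview the resulting qsub option.

```shell
# Hypothetical helper: preview a template after ^N^ is replaced by a worker count.
# $1 = template string, for example the ResourceTemplate value
# $2 = number of workers
expand_template() {
    # \^ matches a literal caret; every ^N^ in the template is replaced.
    printf '%s\n' "$1" | sed "s/\^N\^/$2/g"
}

expand_template '-l nodes=^N^' 4    # prints: -l nodes=4
```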
Export your PBS configuration:
OK Right-click the name of the PBS configuration Export File New Save
The MATLAB interpreter is the part of MATLAB which reads M-files and MEX-files and executes MATLAB statements.
This section illustrates five methods of submitting a small, serial MATLAB program as a batch job to a PBS queue. This MATLAB program prints the name of the run host and generates three random numbers. The MATLAB call system('hostname') returns two values: a status code and the name of the run host.
The first method runs on a front end a MATLAB client which runs the MATLAB batch() function with a PBS configuration. Function batch() is a wrapper for function submit(). Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out three licenses: one MATLAB license for the client running on the front end, one PCT license, and one DCS license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as batch(), and quitting MATLAB. The DCS license remains active between running function batch() and job completion. The DCS license does not appear in the output of function license().

The second method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB batch() function with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as batch(), and quitting MATLAB. This job is completely off the front end.

The third method runs on a front end a MATLAB client which runs the MATLAB submit() function with the 'torque' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'torque' scheduler, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out three licenses: one MATLAB license for the client running on the front end, one PCT license, and one DCS license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. The DCS license remains active between running function submit() and job completion. The DCS license does not appear in the output of function license().
The fourth method uses the PBS qsub command to submit to a compute node a MATLAB client which runs the MATLAB submit() function with the 'local' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'local' scheduler, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The fifth method uses the PBS qsub command to submit to a compute node a MATLAB client which interprets an M-file with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter; so, it requires and checks out one license: one MATLAB license for the client running on a compute node. The MATLAB license remains active between starting and quitting MATLAB. This job is completely off the front end.
The following table summarizes MATLAB license usage:
| Method | Description | MATLAB | PCT | DCS | mcc | Limitations |
|---|---|---|---|---|---|---|
| 1 | batch() with user-defined PBS configuration | 1 | 1 | 1 | 0 | number of MATLAB, PCT, and DCS licenses purchased |
| 2 | batch() with 'local' configuration, qsub | 1 | 1 | 0 | 0 | 'local' configuration allows at most 8 (R2009a) or 12 (R2011a) workers |
| 3 | submit() with 'torque' scheduler | 1 | 1 | 1 | 0 | number of MATLAB, PCT, and DCS licenses purchased |
| 4 | submit() with 'local' scheduler, qsub | 1 | 1 | 0 | 0 | 'local' scheduler allows at most 8 (R2009a) or 12 (R2011a) workers |
| 5 | qsub | 1 | 0 | 0 | 0 | number of MATLAB licenses purchased |
Prepare a MATLAB serial program in the form of a MATLAB script M-file and a MATLAB function M-file with appropriate filenames, here named myscript.m and myfunction.m:
% FILENAME: myscript.m
% Display name of compute node which ran this job.
[c name] = system('hostname');
fprintf('\n\nhostname:%s\n', name)
% Display three random numbers.
A = rand(1,3);
fprintf('%f %f %f\n', A);
% FILENAME: myfunction.m
function result = myfunction ()
% Return name of compute node which ran this job.
[c name] = system('hostname');
result = sprintf('hostname:%s', name);
% Return three random numbers.
A = rand(1,3);
r = sprintf('%f %f %f', A);
result=strvcat(result,r);
end
The function M-file returns a single value: a concatenation of the name of the compute node which runs the function and the three random numbers.
For the first method of job submission, use a MATLAB M-file (MATLAB function batch() accepts either a script M-file or a function M-file).
At the MATLAB prompt, discover which MATLAB licenses are in use. View the current default parallel configuration; most likely, it is the 'local' configuration. Call MATLAB function batch() to run the MATLAB code in the file myscript.m with your PBS configuration which you previously designed with the Configuration Manager (using the 'local' configuration at this point will run the MATLAB program on the front end). Also, capture job output in the diary. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). While your job is running, get a list of the licenses in use. The list includes the MATLAB license and the PCT license. The DCS license of the worker does not show. After your job finishes, verify that the PCT license remains in use. View results by either viewing the diary, loading the job, or getting all output arguments into a cell array. When you do not want to keep the files of your job, use MATLAB function destroy() to erase them from your current working directory. After this job finishes, the default parallel configuration remains the 'local' configuration.
>> license('inuse')
matlab
>> disp(defaultParallelConfig);
local
>> job = batch('myscript','Configuration','mypbsconfig','CaptureDiary',true);
>> !qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
98237.radon-ad myusername standby Job1Task1 -- 1 1 -- 00:01 Q --
>>
>> disp(job.get('State'))
queued
>> disp(job.get('State'))
running
>> license('inuse')
distrib_computing_toolbox
matlab
>> disp(job.get('State'))
finished
>> license('inuse')
distrib_computing_toolbox
matlab
>> job.diary
hostname:radon-a000.rcac.purdue.edu
0.9173 0.6839 0.8661
>> who
Your variables are:
ans job
>> job.load
>> who
Your variables are:
A ans c job name
>> disp(name)
hostname:radon-a000.rcac.purdue.edu
>> disp(A)
0.9173 0.6839 0.8661
>> result = getAllOutputArguments(job);
>> result{1}
ans =
A: [0.9173 0.6839 0.8661]
ans: 'local'
c: 0
name: hostname:radon-a000.rcac.purdue.edu
>> disp(result{1}.name)
hostname:radon-a000.rcac.purdue.edu
>> disp(result{1}.A)
0.9173 0.6839 0.8661
>> ls -l
>> job.destroy;
>> ls -l
>> disp(defaultParallelConfig);
local
>> quit;
$
The license name distrib_computing_toolbox refers to the Parallel Computing Toolbox.
The PBS command qstat shows that the MATLAB client submitted this job as one compute node (NDS) with one processor core (TSK) and with a wall time of one minute.
Output demonstrates three ways to access the results: diary, load, and getAllOutputArguments(). Output shows the name of the compute node (a000) which processed the file myscript.m and three random numbers.
After calling the MATLAB function batch(), you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with the PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy(), so you may rerun MATLAB and find your job:
$ module load matlab/R2011b
$ matlab -nodisplay -singleCompThread
>> sched=findResource('scheduler','Configuration','mypbsconfig');
>> job=findJob(sched,'State','finished');
>> job.diary
>> job.load
>> name
>> A
>> result = getAllOutputArguments(job);
>> result{1}.name
>> result{1}.A
>> destroy(job);
>> quit
$
To apply the first method of job submission to a function M-file, use one of the following sequences:
>> job=batch('myfunction','Configuration','mypbsconfig','CaptureDiary',true);
>> disp(job.get('State'))
finished
>> job.diary
>> job.load
>> ans
>> result=getAllOutputArguments(job);
>> result{1}.ans

>> job=batch('myfunction',1,{},'Configuration','mypbsconfig');
>> disp(job.get('State'))
finished
>> result=getAllOutputArguments(job);
>> result{1}

>> job=batch(@myfunction,1,{},'Configuration','mypbsconfig');
>> disp(job.get('State'))
finished
>> result=getAllOutputArguments(job);
>> result{1}
Function batch() offers an alternate mode of operation. If you omit the property Configuration and its value from the argument list of function batch(), then batch() reads the default parallel configuration to discover your submission instructions. Before calling batch(), set the default parallel configuration to the PBS configuration which you previously designed with the Configuration Manager. This change of the default parallel configuration is immediate and permanent; the PBS configuration remains the default parallel configuration during subsequent runs of MATLAB.
To scale up this method to handle a real application, use the Configuration Manager in the Parallel menu to increase the wall time to accommodate a longer running job. Enter a new wall time in the property SubmitArguments.
The second method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the first method since it uses the same MATLAB M-file, either myscript.m or myfunction.m. This method differs from the first because the MATLAB client runs on a compute node rather than on the front end, and the client uses the MATLAB 'local' configuration rather than a user-defined PBS configuration.
Prepare a MATLAB script M-file that calls MATLAB function batch() which specifies the 'local' configuration and which captures job output in the diary. Use an appropriate filename, here named mylclbatch.m:
% FILENAME: mylclbatch.m
!echo "mylclbatch.m"
!hostname
job=batch('myscript','Configuration','local','CaptureDiary',true);
job.wait;
job.diary
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -singleCompThread -r mylclbatch
Submit the job as a single compute node with two processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=2,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclbatch.m; one processor core runs the MATLAB M-file.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
99025.radon-ad myusername standby myjob.sub 30197 1 2 -- 00:01 R 00:00
Output shows two processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclbatch.m
radon-a639.rcac.purdue.edu
hostname:radon-a639.rcac.purdue.edu
0.917276 0.683883 0.866076
Output shows that processor cores on one compute node (a639) processed the entire job. One processor core processed myjob.sub and mylclbatch.m. One processor core processed the batch script myscript.m. Output also displays three random numbers.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply the second method of job submission to a function M-file, modify mylclbatch.m with one of the following sequences:
job=batch('myfunction','Configuration','local','CaptureDiary',true);
job.wait;
job.diary

job=batch('myfunction',1,{},'Configuration','local');
job.wait;
result = getAllOutputArguments(job);
result{1}

job=batch(@myfunction,1,{},'Configuration','local');
job.wait;
result = getAllOutputArguments(job);
result{1}
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job.
For the third method of job submission, use the MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file).
Prepare a MATLAB script M-file which finds the MATLAB 'torque' scheduler, specifies one minute of wall time for this short job along with the number of PCT and DCS licenses, defines a MATLAB serial job with one task, and calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler', 'type', 'torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+1');
job = createJob(sched);
set(job,'FileDependencies',{'myfunction.m'});
task = createTask(job,@myfunction,1,{});
submit(job);
disp('FINISHED SUBMITTING')
On a front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, run mypbssubmit.m. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, get all output arguments into a cell array and display the results. When you do not want to keep the files of a job, use MATLAB function destroy() to erase them from your current working directory.
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay -singleCompThread
>> mypbssubmit
FINISHED SUBMITTING
>> !qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
98238.radon-ad myusername standby Job2Task1 26391 1 1 -- 00:01 Q 00:00
>> disp(job.get('State'))
queued
>> disp(job.get('State'))
running
>> disp(job.get('State'))
finished
>> result = getAllOutputArguments(job)
result =
[2x37 char]
>> result{1}
ans =
hostname:radon-a639.rcac.purdue.edu
0.917276 0.683883 0.866076
>> ls -l
>> job.destroy;
>> ls -l
>> quit
$
The qstat command shows that the MATLAB client submitted this job as one compute node (NDS) with one processor core (TSK) and the requested wall time of one minute.
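The column reading described above can be sketched in a few lines of shell. This is only an illustration: the helper name qstat_summary is hypothetical, and the column positions (NDS in field 6, TSK in field 7, job state in field 10) are assumed from the sample qstat output shown above rather than from any guaranteed format.

```shell
#!/bin/sh
# Hypothetical helper: summarize one job line of "qstat -u" output.
# Field positions are assumed from the sample output in this guide.
qstat_summary() {
    echo "$1" | awk '{ printf "nodes=%s cores=%s state=%s\n", $6, $7, $10 }'
}

# Job line copied from the sample output above.
line='98238.radon-ad myusername standby Job2Task1 26391 1 1 -- 00:01 Q 00:00'
qstat_summary "$line"
```

In practice the job line would come from the output of qstat -u myusername, for example by filtering on the job ID.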
Output shows the name of the compute node (a639) which processed the file myfunction.m. Output also displays the three random numbers.
After the "FINISHED SUBMITTING" message, you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy(), so you may rerun MATLAB and find your job:
$ module load matlab/R2011b
$ matlab -nodisplay -singleCompThread
>> sched=findResource('scheduler','type','torque');
>> job=findJob(sched,'State','finished');
>> result = getAllOutputArguments(job);
>> result{1}
>> job.destroy;
>> quit
$
Function submit() offers an alternate mode of operation. It can read your PBS configuration made previously with the Configuration Manager. Here is an alternate version of mypbssubmit.m which relies on mypbsconfig for the several details of job submission:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','configuration','mypbsconfig');
job = createJob(sched);
set(job,'FileDependencies',{'myfunction.m'});
task = createTask(job,@myfunction,1,{});
submit(job);
disp('FINISHED SUBMITTING');
To scale up this method to handle a real application, increase the wall time in mypbssubmit.m to accommodate a longer running job.
The fourth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the third method since it uses the same MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file). This method differs from the third because the MATLAB client runs on a compute node rather than on the front end, and the client uses the MATLAB 'local' scheduler rather than the MATLAB 'torque' scheduler.
Prepare a MATLAB script M-file which finds the MATLAB 'local' scheduler, which defines a MATLAB serial job and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mylclsubmit.m:
% FILENAME: mylclsubmit.m
!echo "mylclsubmit.m"
!hostname
sched = findResource('scheduler', 'type', 'local');
set(sched,'ClusterMatlabRoot',matlabroot);
job = createJob(sched);
set(job,'FileDependencies',{'myfunction.m'});
task = createTask(job,@myfunction,1,{});
submit(job);
disp('FINISHED SUBMITTING')
job.wait;
result = getAllOutputArguments(job);
result{1}
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -singleCompThread -r mylclsubmit
Submit the job as a single node with two processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=2,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclsubmit.m; one processor core runs the MATLAB function.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
97986.radon-ad myusername standby myjob.sub 4645 1 2 -- 00:01 R 00:00
Output shows two processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclsubmit.m
radon-a639.rcac.purdue.edu
FINISHED SUBMITTING
ans =
radon-a639.rcac.purdue.edu
0.917276 0.683883 0.866076
Output shows that processor cores on one compute node (a639) processed the entire job. One processor core processed myjob.sub and mylclsubmit.m. One processor core processed the function M-file myfunction.m. Output also displays three random numbers.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job.
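As a rough illustration of scaling up, the sketch below converts a run-time estimate in minutes into the HH:MM:SS form used by the walltime resource and prints (but does not run) the matching qsub command. The helper name walltime_from_minutes and the 90-minute estimate are hypothetical; the rest of the resource request is copied from the qsub command of this method.

```shell
#!/bin/sh
# Hypothetical helper: convert whole minutes to HH:MM:SS for "-l walltime=".
walltime_from_minutes() {
    minutes=$1
    printf '%02d:%02d:00\n' $((minutes / 60)) $((minutes % 60))
}

# Example with an assumed 90-minute estimate. The command is only printed.
wt=$(walltime_from_minutes 90)
echo "qsub -l nodes=1:ppn=2,walltime=$wt,gres=Parallel_Computing_Toolbox+1 myjob.sub"
```

Printing the command first makes it easy to check the request before submitting it to the queue.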
The fifth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the second and fourth methods since it uses either a MATLAB script M-file or a MATLAB function M-file. Like the second and fourth methods, this method runs the MATLAB client on a compute node rather than on the front end. This places the 'local' configuration on the compute node, rather than on the front end. This allows using the 'local' configuration rather than a user-defined configuration to run a MATLAB program on a compute node. What is different is that the MATLAB script M-file must quit; the function M-file requires no change.
Modify the MATLAB script M-file myscript.m with a quit statement. The MATLAB function M-file myfunction.m needs no change:
% FILENAME: myscript.m
% Display name of compute node which ran this job.
[c name] = system('hostname');
fprintf('\n\nhostname:%s\n', name);
% Display three random numbers.
A = rand(1,3);
fprintf('%f %f %f\n', A);
quit;
% FILENAME: myfunction.m
function result = myfunction ()
% Return name of compute node which ran this job.
[c name] = system('hostname');
result = sprintf('hostname:%s', name);
% Return three random numbers.
A = rand(1,3);
r = sprintf('%f %f %f', A);
result=strvcat(result,r);
end
Prepare a job submission file with an appropriate filename, here named myjob.sub. Run with the name of either the script M-file or the function M-file:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -singleCompThread -r myscript
# matlab -nodisplay -singleCompThread -r myfunction
OR:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
matlab -nodisplay -singleCompThread << EOF
% Display name of compute node which ran this job.
[c name] = system('hostname');
fprintf('\n\nhostname:%s\n', name)
% Display three random numbers.
A = rand(1,3);
fprintf('%f %f %f\n', A);
quit;
EOF
# end of MATLAB code
Submit the job as a single compute node with one processor core:
$ qsub -l nodes=1:ppn=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
97986.radon-ad myusername standby myjob.sub 4645 1 1 -- 00:01 R 00:00
Output shows one compute node (NDS) with one processor core (TSK).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
hostname:radon-a639.rcac.purdue.edu
0.814724 0.905792 0.126987
Output shows that a processor core on one compute node (a639) processed the entire job. One processor core processed myjob.sub and myscript.m. Output also displays the three random numbers.
Any output written to standard error will appear in myjob.sub.emyjobid.
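The naming pattern for the two output files can be sketched as follows. The helper name job_output_files is hypothetical, and the sketch assumes that myjobid stands for the numeric part of the job ID (the digits before the first dot in the ID that qsub and qstat display).

```shell
#!/bin/sh
# Hypothetical helper: print the expected stdout and stderr file names.
# $1 = job submission file name, $2 = job ID as shown by qsub or qstat.
job_output_files() {
    jobnum=${2%%.*}        # keep the part before the first dot
    echo "$1.o$jobnum"     # standard output
    echo "$1.e$jobnum"     # standard error
}

# Job ID taken from the sample qstat output above.
job_output_files myjob.sub 97986.radon-ad
```

The first line printed is the file to read for results; the second is the file to check when a job fails.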
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job.
For more information about MATLAB:
The MATLAB Compiler translates an M-file into a standalone application or software component. A compiled version of an M-file can substantially improve performance of MATLAB code, especially for statements like for and while. The MATLAB Compiler Runtime (MCR) is a standalone set of shared libraries. Together, compiling and the MCR enable the execution of MATLAB files, even outside the MATLAB environment. While you do need to purchase a MATLAB Compiler license to build an executable, you may freely distribute the executable and the MCR to as many colleagues and computers as desired without license restrictions.
This section illustrates the sixth method of submitting a small, serial MATLAB program as a batch job to a PBS queue. This MATLAB program prints the name of the run host and displays three random numbers. The system function hostname returns two values: a code and the run host name.
The sixth method of job submission uses the MATLAB Compiler mcc to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. During compilation, the default configuration may be either the 'local' configuration or your PBS configuration; the results will be the same. This job is completely off the front end.
The MATLAB Compiler license is a lingering license. Using the compiler locks its license for at least 30 minutes. Each time you issue the compiler command, you reset the 30-minute timer. When running mcc at the MATLAB prompt, you hold the license as long as MATLAB remains open. To release the license, quit MATLAB. Quitting MATLAB is necessary to release the license, but not sufficient. When you quit MATLAB before the 30-minute timer runs out, then the license remains checked out for the remaining time. When running mcc at the Linux prompt, the 30-minute timer starts or restarts. The license manager tracks MATLAB licenses per user per cluster. If you run the MATLAB Compiler on two different clusters, you use two Compiler licenses. To minimize your license usage, run the Compiler on one cluster.
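The timing rule in the paragraph above can be modeled with a little arithmetic. This is a simplified sketch, not a description of how the license manager is implemented: it assumes the license is held until the later of two moments, the end of the 30-minute timer started by the last mcc command or the moment MATLAB quits. The function name and the example times are made up.

```shell
#!/bin/sh
# Simplified model of the lingering-license rule described above.
# $1 = time of the last mcc command, $2 = time MATLAB was quit,
# both in seconds since the epoch. Prints the earliest release time.
LINGER_SECONDS=1800   # 30 minutes, as stated above

compiler_license_release() {
    timer_end=$(( $1 + LINGER_SECONDS ))
    if [ "$timer_end" -gt "$2" ]; then
        echo "$timer_end"
    else
        echo "$2"
    fi
}

# Example with made-up times: mcc at t=1000 s, MATLAB quit at t=1600 s.
compiler_license_release 1000 1600
```

In this example the timer outlasts the MATLAB session, so the license stays checked out until the timer expires.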
Unlike compilers of typical programming languages like C, C++, and Fortran, the MATLAB Compiler does not generate machine executable code. Instead, it encrypts MATLAB code so that it cannot be viewed or modified. It also applies a wrapper around the code. Including the MCR in the compilation makes a GUI-less, standalone application which you may distribute royalty-free. You can share your compiled program with colleagues who have neither MATLAB licenses nor the MCR. The compiled program carries what it needs for licensing, so you do not need to install any license file on your colleague's machine.
The following table summarizes MATLAB license usage:
| Method | MATLAB | PCT | DCS | mcc |
|---|---|---|---|---|
| Run within MATLAB | 1 | 0 | 0 | 1 |
| Run without MATLAB | 0 | 0 | 0 | 1 |
Prepare either a MATLAB script M-file or a MATLAB function M-file. The method described below works for both.
The MATLAB script M-file includes the MATLAB statement quit to ensure that the compiled program terminates. Use an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Display name of compute node which ran this job.
[c name] = system('hostname');
fprintf('\n\nhostname:%s\n', name)
% Display three random numbers.
A = rand(1,3);
fprintf('%f %f %f\n', A);
quit;
The MATLAB function M-file has the usual function and end statements. Use an appropriate filename, here named myfunction.m:
% FILENAME: myfunction.m
function result = myfunction ()
% Return name of compute node which ran this job.
[c name] = system('hostname');
result = sprintf('hostname:%s', name);
% Return three random numbers.
A = rand(1,3);
r = sprintf('%f %f %f', A);
result=strvcat(result,r);
end
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_myscript.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC and verify the versions loaded. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Compile the MATLAB script M-file:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ mcc -m myscript.m
A few new files appear after the compilation:
mccExcludedFiles.log
myscript
myscript.prj
myscript_main.c
myscript_mcc_component_data.c
readme.txt
run_myscript.sh
The name of the stand-alone executable file is myscript. The name of the shell script to run this executable file is run_myscript.sh.
To obtain the name of the compute node which runs this compiler-generated script run_myscript.sh, insert the Linux commands echo and hostname before the first echo statement so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_myscript.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
  echo Usage:
  echo $0 \args
else
  echo Setting up environment variables
  MCRROOT="$1"
  echo ---
  LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
  MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
  XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
  export LD_LIBRARY_PATH;
  export XAPPLRESDIR;
  echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
  shift 1
  "${exe_dir}"/myscript $*
fi
exit
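To see which directories run_myscript.sh adds to the library search path for a given root, the path construction can be reproduced on its own. The sketch below is a hedged restatement of the assignments in the script above; the function name mcr_library_path is hypothetical, and nothing is exported or executed.

```shell
#!/bin/sh
# Sketch: build the same library search path that run_myscript.sh exports.
# $1 = MCR (or MATLAB) root directory passed to run_myscript.sh.
mcr_library_path() {
    root=$1
    jre=$root/sys/java/jre/glnxa64/jre/lib/amd64
    path=.:$root/runtime/glnxa64
    path=$path:$root/bin/glnxa64
    path=$path:$root/sys/os/glnxa64
    path=$path:$jre/native_threads:$jre/server:$jre/client:$jre
    echo "$path"
}

# Print one directory per line for easier reading.
mcr_library_path /apps/rhel5/MATLAB/R2011b | tr ':' '\n'
```

Comparing this list with the LD_LIBRARY_PATH line in the job output is a quick way to confirm which MATLAB root a compiled job actually used.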
Submit the job:
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
378428.radon-ad kes workq myjob.sub 18964 1 1 -- 00:01 R 00:00
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a637.rcac.purdue.edu
run_myscript.sh
radon-a637.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
hostname:radon-a637.rcac.purdue.edu
0.814724 0.905792 0.126987
Output shows the name of the compute node that ran the job submission file myjob.sub, the name of the compute node that ran the compiler-generated script run_myscript.sh, and the name of the compute node that ran the serial job: a637 in all three cases. Output also shows the three random numbers.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply this method of job submission to a MATLAB function M-file, prepare a wrapper function which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m:
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
Compile both the wrapper and the function then submit:
$ mcc -m mywrapper.m myfunction.m
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job.
For more information about the MATLAB Compiler:
MEX stands for MATLAB Executable. A MEX-file offers an interface which allows MATLAB code to call functions written in C, C++, or Fortran as though these external functions were built-in MATLAB functions. MATLAB also offers external interface functions that facilitate the transfer of data between MEX-files and MATLAB. A MEX-file usually starts by transferring data from MATLAB to the MEX-file; then it processes the data with the user-written code; and finally, it transfers the results back to MATLAB. This feature involves compiling then dynamically linking the MEX-file to the MATLAB program. You may wish to use a MEX-file if you would like to call an existing C, C++, or Fortran function directly from MATLAB rather than reimplementing that code as a MATLAB function. Also, by implementing performance-critical routines in C, C++, or Fortran rather than MATLAB, you may be able to substantially improve performance over MATLAB source code, especially for statements like for and while. Areas of application include legacy code written in C, C++, or Fortran.
This section illustrates how to use the PBS qsub command to submit a small MATLAB job with a MEX-file to a PBS queue.
The first MEX example calls a C function which employs serial code to add two matrices. This example, when executed, uses the MATLAB interpreter, so it requires and checks out a MATLAB license.
The second MEX example calls a C function which employs MPI to distribute the work of a message-passing program among several compute nodes. This example, when executed, uses the MATLAB interpreter, so it requires and checks out a MATLAB license. This example avoids using a PCT license.
The third MEX example calls a C function which employs OpenMP to distribute the work of a shared-memory program (parallel for loop) among several threads. This example, when executed, uses the MATLAB interpreter, so it requires and checks out a MATLAB license. This example avoids using a PCT license.
The fourth MEX example calls a C function which employs both MPI and OpenMP to distribute the work of a hybrid program across compute nodes and across processor cores within each compute node. This example, when executed, uses the MATLAB interpreter, so it requires and checks out a MATLAB license. This example avoids using a PCT license.
For the first example, prepare a complicated and time-consuming computation in the form of a C, C++, or Fortran function. In this example, the computation is a C function which adds two matrices:
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, int n) {
int i;
/* Matrix (component-wise) addition. */
for (i = 0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
Combine the computational routine with a MEX-file, which contains the necessary external function interface of MATLAB. In the computational routine, change int to mwSize. Use an appropriate filename, here named matrixSum.c:
/***********************************************************
* FILENAME: matrixSum.c
*
* Adds two MxN arrays (inMatrix).
* Outputs one MxN array (outMatrix).
*
* The calling syntax is:
*
* outMatrix = matrixSum (inMatrix, inMatrix)
*
* This is a MEX-file for MATLAB.
*
**********************************************************/
#include "mex.h"
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, mwSize n) {
mwSize i;
/* Component-wise addition. */
for (i = 0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
/* Gateway Function */
void mexFunction (int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[]) {
double *inMatrix_a; /* mxn input matrix */
double *inMatrix_b; /* mxn input matrix */
mwSize nrows_a,ncols_a; /* size of matrix a */
mwSize nrows_b,ncols_b; /* size of matrix b */
double *outMatrix_c; /* mxn output matrix */
/* Check for proper number of arguments. */
if(nrhs!=2) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:nrhs","Two inputs required.");
}
if(nlhs!=1) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:nlhs","One output required.");
}
/* Get dimensions of the first input matrix. */
nrows_a = mxGetM(prhs[0]);
ncols_a = mxGetN(prhs[0]);
/* Get dimensions of the second input matrix. */
nrows_b = mxGetM(prhs[1]);
ncols_b = mxGetN(prhs[1]);
/* Check for equal number of rows. */
if(nrows_a != nrows_b) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:notEqual","Unequal number of rows.");
}
/* Check for equal number of columns. */
if(ncols_a != ncols_b) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:notEqual","Unequal number of columns.");
}
/* Make a pointer to the real data in the first input matrix. */
inMatrix_a = mxGetPr(prhs[0]);
/* Make a pointer to the real data in the second input matrix. */
inMatrix_b = mxGetPr(prhs[1]);
/* Make the output matrix. */
plhs[0] = mxCreateDoubleMatrix(nrows_a,ncols_a,mxREAL);
/* Make a pointer to the real data in the output matrix. */
outMatrix_c = mxGetPr(plhs[0]);
/* Call the computational routine. */
matrixSum(inMatrix_a,inMatrix_b,outMatrix_c,nrows_a*ncols_a);
}
Prepare a MATLAB script M-file with an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
A = [1,1,1;1,1,1]
B = [2,2,2;2,2,2]
C = matrixSum(A,B)
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l # FILENAME: myjob.sub echo "myjob.sub" hostname module load matlab/R2011b cd $PBS_O_WORKDIR unset DISPLAY # -nodisplay: run MATLAB in text mode; X11 server not needed # -singleCompThread: turn off implicit parallelism # -r: read MATLAB program; use MATLAB JIT Accelerator matlab -nodisplay -singleCompThread -r myscript
To access the MATLAB utility mex, load a MATLAB module. mex depends on shared libraries from GCC Version 4.3.x. This version is not available on Radon, but try the default GCC (MATLAB MEX does not support GCC 4.6 and up). Compile matrixSum.c into a MATLAB-callable MEX-file:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ mex matrixSum.c
The name of the MATLAB-callable MEX-file is matrixSum.mexa64. If you see the following warning, ignore it:
Warning: You are using gcc version "4.6.2". The version
currently supported with MEX is "4.3.4".
For a list of currently supported compilers see:
http://www.mathworks.com/support/compilers/current_release/
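A quick way to anticipate this warning is to compare a GCC version string against the limit quoted above (MEX does not support GCC 4.6 and up). The sketch below is only an illustration: the function name gcc_below_mex_limit is hypothetical, the limit is taken from the text of this guide, and the version strings are passed in by hand rather than read from an installed compiler.

```shell
#!/bin/sh
# Hypothetical check: succeed only for GCC versions below 4.6,
# the limit quoted in this guide for MEX with MATLAB R2011b.
gcc_below_mex_limit() {
    major=${1%%.*}        # text before the first dot
    rest=${1#*.}          # text after the first dot
    minor=${rest%%.*}     # second version component
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 6 ]; }
}

for v in 4.3.4 4.4.5 4.6.2; do
    if gcc_below_mex_limit "$v"; then
        echo "gcc $v: below 4.6"
    else
        echo "gcc $v: 4.6 or newer, expect the MEX version warning"
    fi
done
```

On a system with GCC loaded, the argument could be taken from the output of gcc -dumpversion.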
Submit the job:
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a148.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
myscript.m: hostname:radon-a148.rcac.purdue.edu
A =
1 1 1
1 1 1
B =
2 2 2
2 2 2
C =
3 3 3
3 3 3
Output shows the name of the compute node (a148) which processed this serial job. Also, this job shared the compute node with other jobs.
Any output written to standard error will appear in myjob.sub.emyjobid.
Rerun this serial job so that it has exclusive access to its compute node:
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 myjob.sub
For the second example, prepare a MEX file with a function containing MPI function calls. Use an appropriate filename, here named mex_mpi.c:
/* FILENAME: mex_mpi.c */
#include "mex.h"
#include <stdio.h>
#include <mpi.h>
void f () {
/* MPI Parameters */
int rank, size, len;
char name[MPI_MAX_PROCESSOR_NAME];
/* All ranks initiate the message-passing environment. */
/* Each rank obtains information about itself and its environment. */
MPI_Init(/*&argc, &argv*/ 0,0); /* start MPI */
MPI_Comm_size(MPI_COMM_WORLD, &size); /* get number of ranks */
MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get rank */
MPI_Get_processor_name(name, &len); /* get run-host name */
printf("Runhost:%s Rank:%d of %d ranks hello, world\n", name,rank,size);
MPI_Finalize(); /* terminate MPI */
return;
}
void mexFunction(int nlhs,mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
/* Check for proper number of arguments. */
if(nrhs!=0) {
mexErrMsgTxt("Zero input required.");
} else if(nlhs>0) {
mexErrMsgTxt("Too many output arguments.");
}
/* Display the name of the compute node. */
f();
}
Prepare a MATLAB script M-file with an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
% Display the names of compute nodes which run the MPI ranks.
mex_mpi();
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
module load mvapich2/1.7_gcc-4.4.5
cd $PBS_O_WORKDIR
unset DISPLAY
# -n: 4 MPI ranks
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
mpiexec -n 4 matlab -nodisplay -singleCompThread -r myscript
To access the MATLAB utility mex, load a MATLAB module. mex depends on shared libraries from GCC Version 4.3.x. This version is not available on Radon, but try the default GCC (MATLAB MEX does not support GCC 4.6 and up). Load GCC Version 4.4.5 with a recent version of MPI-2. Compile the C program into a MATLAB-callable MEX-file:
$ module load matlab/R2011b
$ module load mvapich2/1.7_gcc-4.4.5
$ mex mex_mpi.c CC="mpicc"
The name of the MATLAB-callable MEX-file is mex_mpi.mexa64.
Submit the job while requesting four compute nodes, each with one processor core and one MPI rank:
$ qsub -l nodes=4:ppn=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a148.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
myscript.m: hostname:radon-a148.rcac.purdue.edu
myscript.m: hostname:radon-a158.rcac.purdue.edu
myscript.m: hostname:radon-a159.rcac.purdue.edu
myscript.m: hostname:radon-a160.rcac.purdue.edu
Runhost:radon-a148.rcac.purdue.edu Rank:0 of 4 ranks hello, world
Runhost:radon-a159.rcac.purdue.edu Rank:2 of 4 ranks hello, world
Runhost:radon-a158.rcac.purdue.edu Rank:1 of 4 ranks hello, world
Runhost:radon-a160.rcac.purdue.edu Rank:3 of 4 ranks hello, world
Output shows the names of the compute nodes (a148, a158, a159, a160) which processed this MPI job. The MPI ranks resided on different compute nodes. Also, this job shared its compute nodes with other jobs.
Any output written to standard error will appear in myjob.sub.emyjobid.
Rerun this MPI job so that each rank has exclusive access to its compute node:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
% Display the names of compute nodes which run the MPI ranks.
mex_mpi();
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
module load mvapich2/1.7_gcc-4.4.5
cd $PBS_O_WORKDIR
unset DISPLAY
uniq <$PBS_NODEFILE >nodefile
# -n: 4 MPI ranks
# -machinefile: alternate source for compute node names
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
mpiexec -n 4 -machinefile nodefile matlab -nodisplay -singleCompThread -r myscript
$ qsub -l nodes=4:ppn=8,walltime=00:01:00 myjob.sub
myjob.sub
radon-a002.rcac.purdue.edu
myscript.m: hostname:radon-a002.rcac.purdue.edu
myscript.m: hostname:radon-a003.rcac.purdue.edu
myscript.m: hostname:radon-a005.rcac.purdue.edu
myscript.m: hostname:radon-a004.rcac.purdue.edu
Runhost:radon-a002.rcac.purdue.edu Rank:0 of 4 ranks hello, world
Runhost:radon-a004.rcac.purdue.edu Rank:2 of 4 ranks hello, world
Runhost:radon-a003.rcac.purdue.edu Rank:1 of 4 ranks hello, world
Runhost:radon-a005.rcac.purdue.edu Rank:3 of 4 ranks hello, world
Output shows that each MPI rank resides on a different compute node. Each rank has exclusive access to its compute node.
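The effect of the uniq line in myjob.sub can be tried outside of a job. The sketch below imitates a node file under the assumption that PBS lists each compute node once per allocated processor core, with the lines for one node grouped together; the host names and the 2-node, 4-core layout are made up for the example.

```shell
#!/bin/sh
# Sketch: imitate a node file for nodes=2:ppn=4 (hypothetical host names)
# and reduce it to one line per compute node, as uniq does in myjob.sub.
nodefile_in=$(mktemp)
nodefile_out=$(mktemp)

for host in radon-a002 radon-a003; do
    for core in 1 2 3 4; do
        echo "$host"      # one line per allocated core
    done
done > "$nodefile_in"

# uniq drops adjacent duplicate lines, leaving one line per node.
uniq < "$nodefile_in" > "$nodefile_out"

total_count=$(wc -l < "$nodefile_in" | tr -d ' ')
unique_count=$(wc -l < "$nodefile_out" | tr -d ' ')
echo "lines before uniq: $total_count"
echo "lines after uniq:  $unique_count"
cat "$nodefile_out"

rm -f "$nodefile_in" "$nodefile_out"
```

With one line per node in the machine file, mpiexec places one rank on each compute node even though the job reserved every core of each node.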
For the third example, prepare a MEX file with a function containing OpenMP directives and function calls. Use an appropriate filename, here named mex_openmp.c:
/* FILENAME: mex_openmp.c */
#include "mex.h"
#include <stdio.h>
#include <unistd.h> /* gethostname */
#include <omp.h>
void f () {
/* SERIAL REGION (master thread) */
/* Parameters of the Application */
int len=30;
char name[30]; /* run-host name */
int i; /* loop control variable */
/* OpenMP Parameters */
int id, nthreads;
/* Master thread obtains information about itself and its environment. */
nthreads = omp_get_num_threads(); /* get number of threads */
id = omp_get_thread_num(); /* get thread ID */
gethostname(name,len); /* get run-host name */
printf("SERIAL REGION: Runhost:%s Thread:%d of %d thread hello, world\n", name,id,nthreads);
/* Open parallel region. */
#pragma omp parallel shared(nthreads)
{nthreads = omp_get_num_threads(); /* get number of threads */
} /* store value in shared nthreads of serial region */
/* printf("nthreads = %d\n", nthreads); */
/* PARALLEL REGION */
#pragma omp parallel for private(name,id) firstprivate(nthreads)
for (i=0; i<2*nthreads; i++) {
nthreads = omp_get_num_threads(); /* get number of threads */
id = omp_get_thread_num(); /* get thread ID */
gethostname(name,len); /* get run-host name */
printf("PARALLEL LOOP: Runhost:%s Thread:%d of %d threads Iteration:%2d hello, world\n", name,id,nthreads,i);
} /* lexical extent of loop-level parallelism */
/* SERIAL REGION (master thread) */
nthreads = omp_get_num_threads(); /* get number of threads */
printf("SERIAL REGION: Runhost:%s Thread:%d of %d thread hello, world\n", name,id,nthreads);
return;
}
void mexFunction(int nlhs,mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
/* Check for proper number of arguments. */
if(nrhs!=0) {
mexErrMsgTxt("Zero input required.");
} else if(nlhs>0) {
mexErrMsgTxt("Too many output arguments.");
}
/* Display the name of the compute node. */
/* Display the iterations which each thread processes. */
f();
}
Prepare a MATLAB script M-file with an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
% Display the name of the compute node which runs the OpenMP threads.
% Display the iterations which each thread processes.
mex_openmp();
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -singleCompThread -r myscript
To access the MATLAB utility mex, load a MATLAB module. mex depends on shared libraries from GCC Version 4.3.x. This version is not available on Radon, so try the default GCC instead (MATLAB MEX does not support GCC 4.6 and later). GCC implements OpenMP. Run a MATLAB client, compile mex_openmp.c into a MATLAB-callable MEX-file, and quit the client:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ matlab -nodisplay -singleCompThread
>> mex mex_openmp.c CFLAGS="\$CFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"
>> quit;
$
The name of the MATLAB-callable MEX-file is mex_openmp.mexa64. If you see the following warning, ignore it:
Warning: You are using gcc version "4.6.2". The version
currently supported with MEX is "4.3.4".
For a list of currently supported compilers see:
http://www.mathworks.com/support/compilers/current_release/
Submit the job while requesting four processor cores on one compute node:
$ qsub -l nodes=1:ppn=4,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a001.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
myscript.m: hostname:radon-a001.rcac.purdue.edu
SERIAL REGION: Runhost:radon-a001.rcac.purdue.edu Thread:0 of 1 thread hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:0 of 4 threads Iteration: 0 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:0 of 4 threads Iteration: 1 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:2 of 4 threads Iteration: 4 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:3 of 4 threads Iteration: 2 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:2 of 4 threads Iteration: 5 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:1 of 4 threads Iteration: 3 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:3 of 4 threads Iteration: 6 hello, world
PARALLEL LOOP: Runhost:radon-a001.rcac.purdue.edu Thread:1 of 4 threads Iteration: 7 hello, world
SERIAL REGION: Runhost:radon-a001.rcac.purdue.edu Thread:0 of 1 thread hello, world
Output shows the name of the compute node (a001) which processed this OpenMP job. Four threads processed the iterations of the parallel loop. Also, this job shared the compute node with other jobs.
Any output written to standard error will appear in myjob.sub.emyjobid.
Rerun this OpenMP job so that its four threads have exclusive access to their compute node:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
export OMP_NUM_THREADS=4
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -singleCompThread -r myscript
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 myjob.sub
For the fourth example, prepare a MEX file with a function containing MPI function calls and OpenMP directives and function calls. Use an appropriate filename, here named mex_hybrid.c:
/* FILENAME: mex_hybrid.c */
#include "mex.h"
#include <stdio.h>
#include <mpi.h>
#include <omp.h>
void f () {
/* Serial Region (master thread of an MPI rank) */
/* MPI Parameters */
int rank, size, len;
char name[MPI_MAX_PROCESSOR_NAME];
/* OpenMP Parameters */
int id, nthreads;
/* All ranks initiate the message-passing environment. */
/* Each rank obtains information about itself and its environment. */
MPI_Init(0,0); /* start MPI */
MPI_Comm_size(MPI_COMM_WORLD, &size); /* get number of ranks */
MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get rank */
MPI_Get_processor_name(name, &len); /* get run-host name */
/* Master thread obtains information about itself and its environment. */
nthreads = omp_get_num_threads(); /* get number of threads */
id = omp_get_thread_num(); /* get thread */
printf("SERIAL REGION: Runhost:%s Rank:%d of %d ranks, Thread:%d of %d thread hello, world\n", name,rank,size,id,nthreads);
/* Open parallel region. */
/* Each thread obtains information about itself and its environment. */
#pragma omp parallel private(name,id,nthreads)
{MPI_Comm_size(MPI_COMM_WORLD, &size); /* get number of ranks */
MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* get rank */
MPI_Get_processor_name(name, &len); /* get run-host name */
nthreads = omp_get_num_threads(); /* get number of threads */
id = omp_get_thread_num(); /* get thread */
printf("PARALLEL REGION: Runhost:%s Rank:%d of %d ranks, Thread:%d of %d threads hello, world\n", name,rank,size,id,nthreads);
}
/* Close parallel region. */
/* Serial Region (master thread) */
printf("SERIAL REGION: Runhost:%s Rank:%d of %d ranks, Thread:%d of %d thread hello, world\n", name,rank,size,id,nthreads);
/* Exit master thread. */
MPI_Finalize(); /* terminate MPI */
return;
}
void mexFunction(int nlhs,mxArray* plhs[], int nrhs, const mxArray* prhs[])
{
/* Check for proper number of arguments. */
if(nrhs!=0) {
mexErrMsgTxt("Zero input required.");
} else if(nlhs>0) {
mexErrMsgTxt("Too many output arguments.");
}
/* Display the names of the compute nodes. */
/* Display the iterations which each thread processes. */
f();
}
Prepare a MATLAB script M-file with an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
% Display the names of compute nodes which run the MPI ranks.
% Display the iterations which each thread processes.
mex_hybrid();
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
module load mvapich2/1.7_gcc-4.4.5
cd $PBS_O_WORKDIR
unset DISPLAY
# -n: 2 MPI ranks
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
mpiexec -n 2 matlab -nodisplay -singleCompThread -r myscript
To access the MATLAB utility mex, load a MATLAB module. mex depends on shared libraries from GCC Version 4.3.x. This version is not available on Radon, so try the default GCC instead (MATLAB MEX does not support GCC 4.6 and later). GCC implements OpenMP. Load GCC Version 4.4.5 with a recent version of MPI-2. Run a MATLAB client, compile mex_hybrid.c into a MATLAB-callable MEX-file, and quit the client:
$ module load matlab/R2011b
$ module load mvapich2/1.7_gcc-4.4.5
$ matlab -nodisplay -singleCompThread
>> mex mex_hybrid.c CC="mpicc" CFLAGS="\$CFLAGS -fopenmp" LDFLAGS="\$LDFLAGS -fopenmp"
>> quit;
$
The name of the MATLAB-callable MEX-file is mex_hybrid.mexa64.
Submit the job while requesting two compute nodes, each with one MPI rank and four OpenMP threads:
$ qsub -l nodes=2:ppn=4,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a080.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
For product information, visit www.mathworks.com.
myscript.m: hostname:radon-a080.rcac.purdue.edu
myscript.m: hostname:radon-a080.rcac.purdue.edu
SERIAL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a080.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
Output shows the name of the compute node (a080) which processed this hybrid job. Both MPI ranks resided on the same compute node. This compute node has enough processor cores to run both MPI ranks. Also, this job shared the compute node with other jobs.
Any output written to standard error will appear in myjob.sub.emyjobid.
Rerun this hybrid job so that each rank with its four threads has exclusive access to its compute node:
% FILENAME: myscript.m
% Display the name of the compute node which runs this script.
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('myscript.m: hostname:%s\n', name)
% Call the separately compiled and dynamically linked MEX-file.
% Display the names of compute nodes which run the MPI ranks.
% Display the iterations which each thread processes.
mex_hybrid();
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
module load mvapich2/1.7_gcc-4.4.5
cd $PBS_O_WORKDIR
unset DISPLAY
uniq <$PBS_NODEFILE >nodefile
export OMP_NUM_THREADS=4
# -n: 2 MPI ranks
# -machinefile: alternate source for compute node names
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -singleCompThread: turn off implicit parallelism
# -r: read MATLAB program; use MATLAB JIT Accelerator
mpiexec -n 2 -machinefile nodefile matlab -nodisplay -singleCompThread -r myscript
$ qsub -l nodes=2:ppn=8,walltime=00:01:00 myjob.sub
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a193.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
For product information, visit www.mathworks.com.
myscript.m: hostname:radon-a193.rcac.purdue.edu
myscript.m: hostname:radon-a194.rcac.purdue.edu
SERIAL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a193.rcac.purdue.edu Rank:0 of 2 ranks, Thread:0 of 1 thread hello, world
SERIAL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
PARALLEL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:1 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:2 of 4 threads hello, world
PARALLEL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:3 of 4 threads hello, world
SERIAL REGION: Runhost:radon-a194.rcac.purdue.edu Rank:1 of 2 ranks, Thread:0 of 1 thread hello, world
Output shows the names of the compute nodes (a193,a194) which processed this hybrid job. Each MPI rank resided on a different compute node.
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about the MATLAB MEX-file:
To see online documentation about MEX-files, enter at the MATLAB command-line prompt:
>> web([docroot '/techdoc/matlab_external/f29322.html#bsabtn2-1'])
A stand-alone MATLAB program is a C, C++, or Fortran program which calls user-written M-files and the same libraries which MATLAB uses. A stand-alone program has access to MATLAB objects, such as the array and matrix classes, as well as all the MATLAB algorithms. If you would like to implement performance-critical routines in C, C++, or Fortran and still call select MATLAB functions, a stand-alone MATLAB program may be a good option. This approach can perform substantially better than MATLAB source code, especially for loop statements such as for and while, and it still allows the use of specialized MATLAB functions where they are useful.
This section illustrates how to submit a small, stand-alone, MATLAB program to a PBS queue. This C example calls a compiled MATLAB script which computes the inverse of a matrix. This example, when executed, does not use the MATLAB interpreter, so it neither requires nor checks out a MATLAB license.
Prepare a MATLAB function which returns the inverse of a matrix. Use an appropriate filename, here named myinverse.m:
% FILENAME: myinverse.m
function Y = myinverse (X)
% Display name of compute node which runs this function.
[c name] = system('hostname');
fprintf('\n\nhostname:%s\n', name)
% Invert a matrix.
Y = inv(X);
end
Prepare a second MATLAB function which displays a matrix. Use an appropriate filename, here named myprintmatrix.m:
% FILENAME: myprintmatrix.m
function myprintmatrix(A)
disp(A)
end
Prepare a C source file with a main function and the necessary external function interface and give it an appropriate filename, here named myprogram.c. Note that when you invoke a MATLAB function from C, the MATLAB function name appears "mangled". The C program invokes the MATLAB function myinverse using the name mlfMyinverse and the MATLAB function myprintmatrix using the name mlfMyprintmatrix. You must modify all MATLAB function names in this manner when you call them from outside MATLAB:
/* FILENAME: myprogram.c
Inverse of:
A B
------- ------------
1 2 1 1 -3/2 1/2
1 1 1 --> 1 -1 0
3 -1 1 -2 7/2 -1/2
1.0000 -1.5000 0.5000
1.0000 -1.0000 0
-2.0000 3.5000 -0.5000
*/
#include <stdio.h>
#include <math.h>
#include <string.h> /* memcpy */
#include "libmylib.h" /* compiler-generated header file */
int main (const int argc, char ** argv) {
mxArray *A; /* matrix containing input data */
mxArray *B = NULL; /* matrix containing result; NULL until mlfMyinverse fills it */
int Nrow=3, Ncol=3;
double a[] = {1,2,1,1,1,1,3,-1,1}; /* row-major order */
double b[] = {1,1,3,2,1,-1,1,1,1}; /* col-major order */
double *ptr;
printf("Enter myprogram.c\n");
libmylibInitialize(); /* call mylib initialization */
/* Make a zero-initialized Nrow x Ncol MATLAB matrix. */
A = mxCreateDoubleMatrix(Nrow, Ncol, mxREAL);
/* Initialize the MATLAB matrix. */
ptr = (double *)mxGetPr(A);
memcpy(ptr,b,Nrow*Ncol*sizeof(double));
/* Call mlfMyinverse, the compiled version of myinverse.m. */
mlfMyinverse(1,&B,A);
/* Print the results. */
mlfMyprintmatrix(B);
/* Free the matrices allocated during this computation. */
mxDestroyArray(A);
mxDestroyArray(B);
libmylibTerminate(); /* call mylib termination */
printf("Exit myprogram.c\n");
return 0;
}
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
./myprogram
To access the MATLAB Compiler mcc and mbuild, load a MATLAB module. The MATLAB Compiler, mcc, depends on shared libraries from GCC Version 4.3.x. This version is not available on Radon, but GCC Version 4.6.2 is compatible. Compile the user-written, MATLAB functions into a dynamically loaded, shared library. Compile the C program:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ mcc -W lib:libmylib -T link:lib myinverse.m myprintmatrix.m
$ mbuild myprogram.c -L. -lmylib -I.
Several new files appear after the compilation:
libmylib.c libmylib.exports libmylib.h libmylib.so mccExcludedFiles.log myinverse myprintmatrix myprogram readme.txt
The name of the compiled, stand-alone MATLAB program is myprogram. The name of the dynamically linked library of user-written MATLAB functions is mylib.
Submit the job:
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a145.rcac.purdue.edu
Enter myprogram.c
Warning: No display specified. You will not be able to display graphics on the screen.
Warning: Unable to load Java Runtime Environment: libjvm.so: cannot open shared object file: No such file or directory
Warning: Disabling Java support
Hello, Thomas
hostname:radon-a145.rcac.purdue.edu
1.0000 -1.5000 0.5000
1.0000 -1.0000 0
-2.0000 3.5000 -0.5000
Exit myprogram.c
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about the MATLAB stand-alone program:
The MATLAB Engine allows using MATLAB as a computation engine. A MATLAB Engine program is a standalone C, C++, or Fortran program which calls functions of the Engine Library allowing you to start and end a MATLAB process, send data to and from MATLAB, and send commands to be processed in MATLAB. When employed in this manner, MATLAB is a powerful and programmable mathematical subroutine library.
This section illustrates how to submit a small, stand-alone, MATLAB Engine program to a PBS queue. This C program calls functions of the Engine Library to compute the inverse of a matrix. Because the program starts a MATLAB process through the Engine Library, this example, when executed, uses the MATLAB interpreter, so it requires and checks out a MATLAB license.
Prepare a C program which computes the inverse of a matrix. Use an appropriate filename, here named myprogram.c:
/* FILENAME: myprogram.c
A simple program to illustrate how to call MATLAB Engine functions
from a C program.
Inverse of:
A B
------- ------------
1 2 1 1 -3/2 1/2
1 1 1 --> 1 -1 0
3 -1 1 -2 7/2 -1/2
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "engine.h"
#define BUFSIZE 256
int main ()
{
Engine *ep;
mxArray *A = NULL;
mxArray *B = NULL;
int Ncol=3, Nrow=3, col, row, ndx;
double a[] = {1,1,3,2,1,-1,1,1,1}; /* col-major order */
double b[9] = {9,9,9,9,9,9,9,9,9};
char buffer[BUFSIZE+1];
printf("Enter myprogram.c\n");
/* Call engOpen with a NULL string. This starts a MATLAB process */
/* on the current host using the command "matlab". */
if (!(ep = engOpen(""))) {
fprintf(stderr, "\nCan't start MATLAB engine\n");
return EXIT_FAILURE;
}
buffer[BUFSIZE] = '\0';
engOutputBuffer(ep, buffer, BUFSIZE);
/* Make a variable for the data. */
A = mxCreateDoubleMatrix(Ncol, Nrow, mxREAL);
B = mxCreateDoubleMatrix(Ncol, Nrow, mxREAL);
memcpy((void *)mxGetPr(A), (void *)a, sizeof(a));
/* Place the variable A into the MATLAB workspace. */
/* Place the variable B into the MATLAB workspace. */
engPutVariable(ep, "A", A);
engPutVariable(ep, "B", B);
/* Evaluate and display the inverse. */
engEvalString(ep, "B = inv(A)");
printf("%s", buffer);
/* Get variable B from the MATLAB workspace. */
/* Copy inverted matrix to a C array named "b". */
B = engGetVariable(ep, "B");
memcpy((void *)b, (void *)mxGetPr(B), sizeof(b));
ndx = 0;
for (col=0;col<Ncol;++col) {
for (row=0;row<Nrow;++row) {
printf(" %5.1f", b[row*Nrow+col]);
++ndx;
}
printf("\n");
}
/* Free memory. */
mxDestroyArray(A);
mxDestroyArray(B);
/* Close MATLAB engine. */
engClose(ep);
/* Exit C program. */
printf("Exit myprogram.c\n");
return EXIT_SUCCESS;
}
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
./myprogram
Copy MATLAB file engopts.sh to the directory from which you intend to submit Engine jobs. Compile myprogram.c:
$ cp /apps/rhel5/MATLAB/R2011b/bin/engopts.sh .
$ mex -f engopts.sh myprogram.c
Submit the job:
$ qsub -l nodes=1,walltime=00:01:00 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a210.rcac.purdue.edu
Enter myprogram.c
>>
B =
1.0000 -1.5000 0.5000
1.0000 -1.0000 0
-2.0000 3.5000 -0.5000
1.0 -1.5 0.5
1.0 -1.0 0.0
-2.0 3.5 -0.5
Exit myprogram.c
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about the MATLAB Engine program:
To see online documentation about Engine programs, enter at the MATLAB command-line prompt:
>> web([docroot '/techdoc/matlab_external/f29148.html#f26499'])
MATLAB implements implicit parallelism, which, in general, is the exploitation of parallelism that is inherent in many computations, such as matrix multiplication, linear algebra, and performing the same operation on a set of numbers. Implicit parallelism is a form of multithreading which uses hardware to execute multiple threads efficiently. This is different from the explicit parallelism of the Parallel Computing Toolbox. Multithreading aims to increase utilization of a single processor core by using thread-level as well as instruction-level parallelism. A language which provides implicit parallelism might allow the programmer to write the following:
set = [0 1 2 3 4 5 6 7];
result = cos(set);
The language can calculate independently the cosine of each member of the set. The language can spread the computation across available processor cores of a node. The advantage is that a programmer can focus on the problem at hand without worrying over the low-level details of parallelizing the code. Implicit parallelism allows simple code to achieve a substantial improvement in computational performance without additional directives in the programmer's source code.
MATLAB offers implicit parallelism in the form of thread-parallel enabled functions. These functions run on the multicore processors of typical Linux clusters. Since these processor cores, or threads, share a common memory, many MATLAB functions contain multithreading potential. Vector operations, the particular application or algorithm, and the amount of computation (array size) contribute to the determination of whether a function runs serially or with multithreading.
If you have enabled multithreaded computation via File>Preferences>General>Multithreading in R2007a or if multithreading is on by default as it is in releases R2008a and later, you can observe the effect of implicit parallelism with the following example.
Prepare a MATLAB script M-file with thread-parallel enabled vector operations (".*" and ".^"). Use an appropriate filename, here named myscript.m:
% FILENAME: myscript.m
% Implicit Parallelism
warning off all
% Before running, set core count of your compute cluster.
Ncorespernode = 16; % 16 cores per node
Ntest = floor(log2(Ncorespernode))+1;
n = 5000000; % matrix size: 5,000,000
x = zeros(n,1);
del = 2*pi/n;
vectorop = zeros(Ntest,1);
speedup = zeros(Ntest,1);
efficiency = zeros(Ntest,1);
% for-loop implementation
% will not trigger multithreading
tic % start timer
for i=1:n
t = i*del;
x(i) = (sin(t)*exp(-t))^3 + (t^4+5*t^-2)^0.3;
end
forloop = toc; % stop timer
disp(forloop)
% vector implementation
% may trigger multithreading
% depending on the type of computation and work load
for i=1:Ntest
m = 2^(i-1);
maxNumCompThreads(m); % set thread count
tic % start timer
t = (1:n)*del;
x = (sin(t).*exp(-t)).^3 + (t.^4+5*t.^-2).^0.3;
vectorop(i) = toc; % stop timer
speedup(i) = vectorop(1)/vectorop(i);
efficiency(i) = 100*speedup(i)/m;
end
fprintf('N Threads Wall Time Speedup Efficiency\n')
fprintf(' (sec) T1/TN 100*Speedup/N\n')
fprintf('--------- --------- ------- -------------\n')
fprintf('for loop %5.2f N/A N/A\n', forloop)
for i=1:Ntest
fprintf(' %2d %5.2f %4.1f %5.1f\n', 2^(i-1),vectorop(i),speedup(i),efficiency(i))
end
The script turns all warnings off because calling the MATLAB function maxNumCompThreads() triggers a warning about the deprecation of this function in a future release of MATLAB. For now, the function works as advertised.
Prepare a MATLAB script M-file which submits myscript.m with the MATLAB 'local' configuration and displays the diary. Use an appropriate filename, here named mylclbatch.m:
% FILENAME: mylclbatch.m
!echo "mylclbatch.m"
!hostname
job=batch('myscript','Configuration','local','CaptureDiary',true);
job.wait;
job.diary
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclbatch
Submit the job to a single compute node and request exclusive access to that node:
$ qsub -l nodes=1:ppn=16,walltime=00:05:00 myjob.sub
View results in the file for all standard output, myjob.sub.omyjobid:
N Threads Wall Time Speedup Efficiency
(sec) T1/TN 100*Speedup/N
--------- --------- ------- -------------
for loop 16.67 N/A N/A
1 4.58 1.0 100.0
2 2.37 1.9 96.8
4 1.22 3.8 93.9
8 0.65 7.1 88.3
16 0.36 12.9 80.5
Results show the performance of a 16-core compute node. First, output shows the significant difference in performance between code with a for loop, which does not trigger implicit parallelism, and code with vector operations, which do trigger implicit parallelism. Second, output shows that as the number of threads increases, the wall time decreases. This is the effect of implicit parallelism. Speedup is the ratio of the base time for one thread to the time for N threads. Speedup is not perfect. When the number of threads doubles, the wall time does not quite halve. Still, efficiency falls off slowly.
Implicit parallelism comes with disadvantages. It reduces the control that the programmer has over the parallel execution of the program, resulting sometimes in less-than-optimal parallel efficiency. This appears in the speedup column in the example above. When the number of threads increases from one to 16, speedup is noticeably less than 16. Also, implicit parallelism can make debugging difficult.
MATLAB is always greedy about what it can use. Its implicit parallelism discovers how many processor cores physically reside on a compute node and uses all of them. For example, Radon has 8 processor cores per compute node. This number is the return value of maxNumCompThreads() regardless of how many processor cores you requested or how many processor cores other jobs are currently using. These job submissions yield the same value for maxNumCompThreads():
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 myjob.sub
$ qsub -l nodes=1:ppn=4,walltime=00:01:00 myjob.sub
When your job triggers implicit parallelism, it attempts to allocate its threads on all processor cores of the compute node on which the MATLAB client is running, including processor cores running other jobs. This competition can degrade the performance of all jobs running on the node. If an affected processor core participates in a larger, distributed-memory, parallel job involving many other nodes, then performance degradation can become much more widespread.
Cluster performance is partially the responsibility of MATLAB users. When you know that you are coding a serial job but are unsure whether you are using thread-parallel enabled operations, run MATLAB with implicit parallelism turned off. Beginning with R2009b, you can turn multithreading off by starting MATLAB with -singleCompThread:
$ matlab -nodisplay -singleCompThread -r mymatlabprogram
When you are using implicit parallelism, request exclusive access to a compute node by requesting all cores which are physically available on a node of a compute cluster:
$ qsub -l nodes=1:ppn=8,walltime=00:01:00 myjob.sub
Parallel Computing Toolbox commands, such as spmd, preempt multithreading. Note that opening a MATLAB pool neither prevents multithreading nor changes the thread count in effect.
For more information about MATLAB's implicit parallelism:
The MATLAB Parallel Computing Toolbox (PCT) extends the MATLAB language with high-level, parallel-processing features such as parallel for loops, parallel regions, message passing, distributed arrays, and parallel numerical methods. PCT enables task and data parallelism on a multicore processor. It offers a shared-memory computing environment with a maximum of eight MATLAB workers (labs, threads; version R2009a) and 12 workers (labs, threads; version R2011a) running on the local configuration in addition to your MATLAB client. Moreover, the MATLAB Distributed Computing Server (DCS) scales PCT applications up to the limit of your DCS licenses. This section illustrates the fine-grained parallelism of a parallel for loop (parfor) in a pool job. Areas of application include for loops with independent iterations.
This section illustrates eight methods of submitting a small, parallel MATLAB program with a parallel loop (parfor statement) as a batch MATLAB pool job to a PBS queue. This MATLAB program prints the name of the run host and shows the values of variables numlabs and labindex for each iteration of the parfor loop. The MATLAB function system(), called with the hostname command, returns two values: a numerical status code and the name of the compute node on which the call runs.
The first method runs on a front end a MATLAB client which calls the MATLAB batch() function with a user-defined PBS configuration. Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session (including the two serial regions in the MATLAB M-code) while the other four run the iterations of the parfor loop. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running function batch() and quitting MATLAB. The five DCS licenses remain active between running function batch() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license().
The second method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB batch() function with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The third method runs on a front end a MATLAB client which calls the MATLAB submit() function with the 'torque' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'torque' scheduler, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session (including the two serial regions in the MATLAB M-code) while the other four run the iterations of the parfor loop. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. The five DCS licenses remain active between running function submit() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license().
The fourth method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB submit() function with the 'local' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'local' scheduler, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The fifth method uses the PBS qsub command to submit to compute nodes a MATLAB client which interprets an M-file with a user-defined PBS configuration which scatters the MATLAB workers onto different compute nodes. Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out six licenses: one MATLAB license for the client running on the compute node, one PCT license, and four DCS licenses. Four DCS licenses run the iterations of the parfor loop. This job is completely off the front end.
The sixth method uses the PBS qsub command to submit to a compute node a MATLAB client which interprets an M-file with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on a compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. This job is completely off the front end.
The seventh method uses the MATLAB compiler mcc and the default parallel configuration set to a PBS configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. This method uses a PBS configuration during compilation. Since it uses a PBS configuration, this method, when executed, uses the MATLAB Distributed Computing Server; so, it requires and checks out four DCS licenses during the execution of the parfor statement. The serial portions of this job do not use a DCS license. This job is completely off the front end.
The eighth method uses the MATLAB Compiler mcc and the default parallel configuration set to the 'local' configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses the 'local' configuration, this method, when executed, uses no license. (Support for running compiled PCT code on the local configuration was added in R2011a; this feature removes the need for DCS licenses in some cases.) This job is completely off the front end.
The MATLAB Compiler license is a lingering license. Using the compiler locks its license for at least 30 minutes. Each time you issue the compiler command, you reset the 30-minute timer. When running mcc at the MATLAB prompt, you hold the license as long as MATLAB remains open. To release the license, quit MATLAB. Quitting MATLAB is necessary to release the license, but not sufficient. When you quit MATLAB before the 30-minute timer runs out, then the license remains checked out for the remaining time. When running mcc at the Linux prompt, the 30-minute timer starts or restarts. The license manager tracks MATLAB licenses per user per cluster. If you run the MATLAB Compiler on two different clusters, you use two Compiler licenses. To minimize your license usage, run the Compiler on one cluster.
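The lingering behavior described above can be sketched as a small calculation. This is an illustrative model only, not the license manager's actual logic: the function name `compiler_license_release` and its arguments are hypothetical, and the only facts taken from the text are the 30-minute timer, its reset on every mcc call, and the rule that a license taken at the MATLAB prompt is held while MATLAB stays open.

```python
from datetime import datetime, timedelta

LINGER = timedelta(minutes=30)  # lingering period stated in the text above


def compiler_license_release(mcc_times, matlab_quit=None):
    """Estimate when the Compiler license is released (hypothetical helper).

    mcc_times   -- datetimes of each mcc invocation
    matlab_quit -- when MATLAB was quit, if mcc ran at the MATLAB prompt;
                   None if mcc ran at the Linux prompt
    """
    if not mcc_times:
        raise ValueError("at least one mcc invocation is required")
    # Each mcc call restarts the 30-minute timer, so only the last call matters.
    timer_expiry = max(mcc_times) + LINGER
    if matlab_quit is None:
        return timer_expiry
    # At the MATLAB prompt the license is held while MATLAB stays open, and
    # any time left on the timer must still run out after quitting.
    return max(timer_expiry, matlab_quit)


t0 = datetime(2012, 1, 1, 9, 0)  # arbitrary example time
early_quit = compiler_license_release([t0], matlab_quit=t0 + timedelta(minutes=10))
late_quit = compiler_license_release([t0], matlab_quit=t0 + timedelta(hours=2))
print("quit after 10 min ->", early_quit.time())
print("quit after 2 h    ->", late_quit.time())
```

In this model, quitting MATLAB early does not shorten the hold, and leaving MATLAB open extends it, which is why quitting is necessary but not sufficient.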
You can share your compiled program with colleagues who do not have MATLAB licenses. The compiled program carries its required licenses within itself, so you do not need to install any license file on your colleague's machine.
The following table summarizes MATLAB license usage:
| Method | Description | MATLAB | PCT | DCS | mcc | Limitations |
|---|---|---|---|---|---|---|
| 1 | batch() with user-defined PBS configuration | 1 | 1 | Matlabpool + 1 | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 2 | batch() with 'local' configuration, qsub | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 3 | submit() with 'torque' scheduler | 1 | 1 | MaximumNumberOfWorkers | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 4 | submit() with 'local' scheduler, qsub | 1 | 1 | 0 | 0 | local scheduler with 8 (R2009a) and 12 (R2011a) workers |
| 5 | qsub with user-defined PBS configuration | 1 | 1 | pool size | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 6 | qsub with 'local' configuration | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 7 | Compiler with user-defined PBS configuration, qsub | 0 | 0 | pool size | 1 | number of DCS licenses purchased |
| 8 | Compiler with the 'local' configuration, qsub | 0 | 0 | 0 | 1 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
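The table above can also be read as a small lookup. The following sketch encodes only the counts listed in the table; the function name `license_usage` and its parameters are hypothetical and exist purely for illustration.

```python
def license_usage(method, pool_size=0, max_workers=0):
    """Return (MATLAB, PCT, DCS, mcc) license counts for methods 1-8.

    Mirrors the summary table: pool_size is the Matlabpool value and
    max_workers is MaximumNumberOfWorkers (used by method 3 only).
    """
    if method not in range(1, 9):
        raise ValueError("method must be an integer from 1 to 8")
    dcs_by_method = {
        1: pool_size + 1,  # pool plus the worker running the batch session
        3: max_workers,    # MaximumNumberOfWorkers already includes that worker
        5: pool_size,
        7: pool_size,
    }
    dcs = dcs_by_method.get(method, 0)  # 'local' methods 2, 4, 6, 8 use no DCS
    if method in (7, 8):
        return (0, 0, dcs, 1)           # compiled code: only the Compiler license
    return (1, 1, dcs, 0)


for m in range(1, 9):
    print(m, license_usage(m, pool_size=4, max_workers=5))
```

With a four-lab pool, methods 1 and 3 both need five DCS licenses, while methods 5 and 7 need four.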
Prepare a MATLAB pool program in the form of a MATLAB script M-file and a MATLAB function M-file with appropriate filenames, here named myscript.m and myfunction.m:
% FILENAME: myscript.m
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
numlabs = matlabpool('size');
fprintf(' hostname numlabs labindex iteration\n')
fprintf(' ------------------------------- ------- -------- ---------\n')
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL LOOP: %-31s %7d %8d %9d\n', name,numlabs,labindex,i)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel loop: %f\n', elapsed_time)
% FILENAME: myfunction.m
function result = myfunction ()
% SERIAL REGION
% Variable "result" is a "reduction" variable.
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
numlabs = matlabpool('size');
r = sprintf(' hostname numlabs labindex iteration');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- -------- ---------');
result = strvcat(result,r);
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL LOOP: %-31s %7d %8d %9d', name,numlabs,labindex,i);
result = strvcat(result,r);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('Elapsed time in parallel loop: %f', elapsed_time);
result = strvcat(result,r);
end
Both M-files display the names of all compute nodes which run the job. The parfor statement does not set the values of variables numlabs or labindex, but function matlabpool() can return the pool size. The script M-file uses fprintf() to display the results. The function M-file returns a single value which contains a concatenation of the results.
The execution of a pool job starts with a worker (batch session) executing the statements of the first serial region up to the parfor block, when it pauses. A set of workers (the pool) executes the parfor block. When they finish, the batch session resumes by executing the second serial region. The code displays the names of the compute nodes running the batch session and the worker pool.
The first method of job submission uses function batch() to offload from the MATLAB client running on the front end to a batch session running on a worker (compute node) the responsibilities of defining the parfor loop, which runs on another set of workers called the pool, and accumulating the results. The batch session and the pool cooperate on processing a single program. The batch session distributes the independent iterations of the loop to the workers of the pool. The workers of the pool process simultaneously their respective portions of the workload of the parallel loop so that the parallel loop might run faster than the equivalent serial version. A pool size of N requires N+1 workers (processor cores). The source code is a MATLAB M-file (MATLAB function batch() accepts either a script M-file or a function M-file).
On the front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, view the current default parallel configuration; most likely, it is the 'local' configuration. Call MATLAB function batch() to make a four-lab pool on which to run the MATLAB code in the file myscript.m. In the call, replace the 'local' configuration by specifying your PBS configuration which you previously designed with the Configuration Manager (using the 'local' configuration at this point will run the parfor loop on the front end). This particular PBS configuration scatters the labs to different compute nodes to verify that a four-lab parfor loop actually uses five processor cores. Also, capture job output in the diary. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, view results by viewing the diary. When you do not want to keep the files of your job, use MATLAB function destroy() to erase them from your current working directory. After this job finishes, the default parallel configuration remains the 'local' configuration.
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> disp(defaultParallelConfig);
local
>> pjob=batch('myscript','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> license('inuse')
distrib_computing_toolbox
matlab
>> disp(pjob.get('State'))
finished
>> license('inuse')
distrib_computing_toolbox
matlab
>> pjob.diary;
>> ls -l
>> pjob.destroy;
>> ls -l
>> disp(defaultParallelConfig);
local
>> quit;
$
The license name distrib_computing_toolbox refers to the Parallel Computing Toolbox.
View job status from qstat:
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115204.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --
Job status shows that the MATLAB client submitted this job as five compute nodes (NDS), each with one processor core (TSK) and with a requested wall time of one minute. The call to function batch() specifies four labs to evaluate the iterations of the parallel loop (parfor statement). The fifth lab runs the batch session, myscript.m, defines the parfor loop, assigns loop iterations to the other four labs, and accumulates the results. This arrangement explains the presence of five DCS licenses.
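To check the NDS and TSK values without reading the columns by eye, a status line can be split into fields. This is a minimal sketch that assumes the eleven whitespace-separated columns shown in the qstat header above and a job name without spaces; the function name `parse_qstat_line` and the dictionary keys are hypothetical.

```python
QSTAT_COLUMNS = (
    "job_id", "username", "queue", "jobname", "sess_id",
    "nds", "tsk", "memory", "req_time", "state", "elap_time",
)


def parse_qstat_line(line):
    """Split one qstat job line into a dict keyed by column name."""
    fields = line.split()
    if len(fields) != len(QSTAT_COLUMNS):
        raise ValueError("expected %d columns, got %d"
                         % (len(QSTAT_COLUMNS), len(fields)))
    row = dict(zip(QSTAT_COLUMNS, fields))
    # Node and task counts are numeric in the example output above.
    row["nds"] = int(row["nds"])
    row["tsk"] = int(row["tsk"])
    return row


row = parse_qstat_line(
    "115204.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --"
)
print(row["nds"], row["tsk"], row["state"])
```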
View job output from the diary:
SERIAL REGION: hostname:radon-a008.rcac.purdue.edu
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a057.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a073.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a074.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a075.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a057.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a073.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a074.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a075.rcac.purdue.edu 4 1 8
SERIAL REGION: hostname:radon-a008.rcac.purdue.edu
Elapsed time in parallel loop: 5.585185
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs in the pool which processed the parallel loop. Also, output shows the fifth lab which runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires the lab running the batch session in addition to N labs in the pool, there must be at least N+1 processor cores available on the cluster.
The MATLAB client "scattered" the five processor cores (five MATLAB labs) among five different compute nodes. A processor core of one compute node (a008) processed the batch session, including the two serial regions, while four processor cores of four different compute nodes (a057,a073,a074,a075) processed the iterations of the parallel loop. The parfor loop does not set variable numlabs to the number of labs in the pool; nor does it give to each lab in the pool a unique value for variable labindex. Output shows the iterations of the parfor loop in scrambled order since the labs process each iteration independently of the other iterations. You cannot assume that since there are four processor cores and eight iterations, each processor core processes two consecutive iterations of the parfor loop. Two compute nodes (a074 and a075) each processed two nonconsecutive iterations: iterations 5 and 7, and iterations 6 and 8. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. What proves that several processor cores are processing in parallel the eight iterations of the parfor loop is that the processing time decreases as the number of processor cores increases.
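The distribution of iterations can be checked mechanically by grouping the diary lines by host. The sketch below copies the eight PARALLEL LOOP lines from the diary above; the function name `iterations_by_host` is hypothetical, and the parsing assumes the field order printed by myscript.m (hostname, numlabs, labindex, iteration).

```python
from collections import defaultdict

DIARY = """\
PARALLEL LOOP: radon-a057.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a073.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a074.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a075.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a057.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a073.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a074.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a075.rcac.purdue.edu 4 1 8
"""


def iterations_by_host(diary_text):
    """Map each short host name to the sorted loop iterations it ran."""
    hosts = defaultdict(list)
    for line in diary_text.splitlines():
        if not line.startswith("PARALLEL LOOP:"):
            continue  # skip serial-region and header lines
        host, _numlabs, _labindex, iteration = line.split()[2:6]
        hosts[host.split(".")[0]].append(int(iteration))
    return {host: sorted(iters) for host, iters in hosts.items()}


by_host = iterations_by_host(DIARY)
for host, iters in sorted(by_host.items()):
    consecutive = all(b - a == 1 for a, b in zip(iters, iters[1:]))
    print(host, iters, "consecutive" if consecutive else "nonconsecutive")
```

Each host ran two iterations in this run, but the pairing of iterations to hosts is not something a program should rely on.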
Running this example with larger MATLAB pool sizes yields shorter runtimes:
| Pool Size | Time (seconds) |
|---|---|
| 1 | 18.1 |
| 2 | 9.2 |
| 4 | 5.0 |
| 8 | 3.6 |
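The measured times can be compared with a simple lower bound. Each of the eight iterations calls pause(2), so a pool of N labs cannot finish the loop in less than ceil(8/N) times 2 seconds; anything above that is scheduling and communication overhead. The sketch below is an idealized model under that assumption, and the names `ideal_loop_time` and `MEASURED` are hypothetical.

```python
import math

PAUSE_SECONDS = 2   # pause(2) in each parfor iteration
ITERATIONS = 8      # parfor i = 1:8

# Measured loop times from the table above (pool size -> seconds).
MEASURED = {1: 18.1, 2: 9.2, 4: 5.0, 8: 3.6}


def ideal_loop_time(pool_size):
    """Lower bound on loop time if iterations spread evenly with no overhead."""
    return math.ceil(ITERATIONS / pool_size) * PAUSE_SECONDS


for n, t in MEASURED.items():
    speedup = MEASURED[1] / t
    print(f"pool={n}  ideal={ideal_loop_time(n):2d}s  measured={t:4.1f}s  "
          f"speedup={speedup:.2f}  efficiency={speedup / n:.2f}")
```

The gap between the ideal and measured times grows in relative terms as the pool gets larger, which is the usual pattern when a fixed overhead is spread over less work per lab.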
After calling the MATLAB function batch(), you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with the PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy().
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','Configuration','mypbsconfig');
>> pjob=findJob(sched,'State','finished');
>> pjob.diary;
>> pjob.destroy;
>> quit;
$
To apply the first method of job submission to a function M-file, use one of the following sequences:
>> pjob=batch('myfunction','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> disp(pjob.get('State'))
finished
>> pjob.diary
>> pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
Function batch() offers an alternate mode of operation. If you exclude the property Configuration and its value from the list of arguments of function batch(), then batch() reads the default parallel configuration to discover your instructions about submission. Before calling batch(), set the default parallel configuration to your PBS configuration which you previously designed with the Configuration Manager. This change of the default parallel configuration is immediate and permanent; the PBS configuration remains the default parallel configuration during subsequent runs of MATLAB.
To scale up this method to handle a real application, use the Configuration Manager in the Parallel menu to increase the wall time to accommodate a longer running job. Enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of DCS licenses purchased.
The second method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the first method since it uses function batch() and the same MATLAB M-file, either myscript.m or myfunction.m. This method differs from the first because the MATLAB client runs on a compute node rather than on the front end and the client uses the MATLAB 'local' configuration rather than a user-defined PBS configuration.
Prepare a MATLAB script M-file that calls MATLAB function batch() which makes a four-lab pool on which to run the MATLAB code in the file myscript.m, which specifies the 'local' configuration, and which captures job output in the diary. Use an appropriate filename, here named mylclbatch.m:
% FILENAME: mylclbatch.m
!echo "mylclbatch.m"
!hostname
pjob=batch('myscript','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclbatch
Submit the job as a single compute node with six processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=6,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclbatch.m; one processor core runs the two serial regions of the MATLAB M-file; four processor cores run the iterations of the parallel for loop.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
99025.radon-ad myusername standby myjob.sub 30197 1 6 -- 00:01 R 00:00
Job status shows six processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a000.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclbatch.m
radon-a000.rcac.purdue.edu
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 8
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
Elapsed time in parallel loop: 5.411486
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs in the pool which processed the parallel for loop. While output does not explicitly show the fifth lab, that lab runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires the worker running the batch session in addition to N labs in the pool, there must be at least N+1 processor cores available on the cluster.
Output shows that processor cores on one compute node (a000) processed the entire job. One processor core processed myjob.sub and mylclbatch.m. One processor core processed the batch session myscript.m, which includes the two serial regions, while four processor cores processed the iterations of the parallel loop. The parfor loop does not set variable numlabs to the number of labs in the pool; nor does it give to each lab in the pool a unique value for variable labindex. Output shows the iterations of the parfor loop in scrambled order since the labs process each iteration independently of the other iterations. You cannot assume that since there are four processor cores and eight iterations, each processor core processes two consecutive iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. What proves that several processor cores are processing in parallel the eight iterations of the parfor loop is that the processing time decreases as the number of processor cores increases.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply the second method of job submission to a function M-file, modify mylclbatch.m with one of the following sequences:
pjob=batch('myfunction','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be two greater than the value of Matlabpool.
Specifying a MATLAB pool with 12 labs means a total of 13 workers, which exceeds the 12-worker limit of the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);
$ qsub -l nodes=1:ppn=14,walltime=00:05:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using batch (line 172)
You requested a minimum of 13 workers but only 12 workers are allowed with the
local scheduler.
Error in mylclbatch (line 6)
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);}
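The sizing rules for the second method can be checked before submitting. The sketch below applies the two rules stated above: the pool plus the batch session must fit within the 'local' worker limit, and ppn must be two greater than Matlabpool. The function name `method2_resources` and the limit table are hypothetical helpers; the limits themselves (8 for R2009a, 12 for R2011a) come from the text, and the error shown above indicates that R2011b also allows 12.

```python
LOCAL_WORKER_LIMIT = {"R2009a": 8, "R2011a": 12}  # limits quoted in the text


def method2_resources(pool_size, release="R2011a", walltime="00:01:00"):
    """Build the qsub -l resource string for batch() with the 'local' configuration."""
    workers = pool_size + 1            # pool labs plus the batch session
    limit = LOCAL_WORKER_LIMIT[release]
    if workers > limit:
        raise ValueError(
            "requested %d workers but only %d are allowed with the local scheduler"
            % (workers, limit)
        )
    ppn = pool_size + 2                # one more core for the MATLAB client itself
    return "nodes=1:ppn=%d,walltime=%s,gres=Parallel_Computing_Toolbox+1" % (ppn, walltime)


print("qsub -l", method2_resources(4), "myjob.sub")
try:
    method2_resources(12)
except ValueError as err:
    print("rejected:", err)
```

The check covers only the MATLAB worker limit; whether a node actually has that many processor cores is a separate question for the scheduler.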
The third method of job submission uses function submit() to offload from the MATLAB client running on the front end to a batch session running on a worker (compute node) the responsibilities of defining the parfor loop, which runs on another set of workers called the pool, and accumulating the results. The batch session and the pool cooperate on processing a single program. The batch session distributes the independent iterations of the loop to the workers of the pool. The workers of the pool process simultaneously their respective portions of the workload of the parallel loop so that the parallel loop might run faster than the equivalent serial version. A pool size of N requires N+1 workers (processor cores). The source code is a MATLAB function M-file (MATLAB function submit() accepts only a function M-file).
Prepare a MATLAB script M-file which finds the MATLAB 'torque' scheduler, which specifies scattering five processor cores to five different compute nodes and one minute of wall time for this short job and the number of PCT and DCS licenses, which defines a MATLAB pool job with five workers (labs) and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler', 'type', 'torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+5');
pjob = createMatlabPoolJob(sched);
set(pjob,'MinimumNumberOfWorkers',5);
set(pjob,'MaximumNumberOfWorkers',5);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
On a front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, run mypbssubmit. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, get all output arguments into a cell array and display the results. When you do not want to keep the files of a job, use MATLAB function destroy() to erase them from your current working directory.
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> mypbssubmit
FINISHED SUBMITTING
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> disp(pjob.get('State'))
finished
>> result = getAllOutputArguments(pjob)
result =
[13x77 char]
>> result{1}
>> ls -l
>> pjob.destroy;
>> ls -l
>> quit
$
View job status from qstat:
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115265.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --
Job status shows that the MATLAB client submitted this job as five compute nodes (NDS), each with one processor core (TSK), and the requested wall time of one minute. The call to function submit() specifies five labs as the minimum and maximum number of labs of the MATLAB pool. Four labs evaluate the iterations of the parfor loop. The fifth lab runs the batch session, myfunction.m, including the two serial regions, defines the parfor loop, assigns loop iterations to the other four labs, and accumulates the results. This arrangement explains the presence of five DCS licenses.
View job output:
SERIAL REGION: hostname:radon-a001.rcac.purdue.edu
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a009.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a009.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a013.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a013.rcac.purdue.edu 4 1 8
PARALLEL LOOP: radon-a010.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a010.rcac.purdue.edu 4 1 3
SERIAL REGION: hostname:radon-a001.rcac.purdue.edu
Elapsed time in parallel loop: 4.904323
Output shows that the nonnegative scalar integer which is the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (5) defined the number of labs in the pool (4) which processed the parallel loop. Also, output shows the fifth lab which runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires N labs including the lab running the batch session, there must be at least N processor cores available on the cluster.
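The two counting conventions are easy to confuse: with batch(), the Matlabpool value counts only the pool and the batch session is extra, while with submit(), the worker count already includes the batch session. The sketch below states both rules side by side; the function names are hypothetical.

```python
def workers_for_batch(matlabpool):
    """batch(): Matlabpool counts pool labs only; add one for the batch session."""
    return matlabpool + 1


def pool_labs_for_submit(number_of_workers):
    """submit(): the worker count includes the batch session; the rest form the pool."""
    if number_of_workers < 2:
        raise ValueError("need one worker for the batch session and at least one lab")
    return number_of_workers - 1


# Both examples in this section end up with five workers and a four-lab pool.
print(workers_for_batch(4), pool_labs_for_submit(5))
```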
The MATLAB client scattered the five processor cores (five MATLAB labs) among five different compute nodes. A processor core of one compute node (a001) processed the batch session, including the two serial regions, while four processor cores of four different compute nodes (a009,a010,a012,a013) processed the parallel loop. The parfor loop does not set variable numlabs to the number of labs in the pool; nor does it give to each lab in the pool a unique value for variable labindex. Output shows the iterations of the parfor loop in scrambled order since the labs process each iteration independently of the other iterations. You cannot assume that since there are four processor cores and eight iterations, each processor core processes two consecutive iterations of the parfor loop. Two compute nodes (a012 and a013) processed two nonconsecutive iterations. While this example evenly distributed the iterations among the four labs, you cannot assume that MATLAB will use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. What proves that several processor cores are processing in parallel the eight iterations of the parfor loop is that the processing time decreases as the number of processor cores increases.
After the "FINISHED SUBMITTING" message, you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy().
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','type','torque');
>> pjob=findJob(sched,'State','finished');
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob.destroy;
>> quit
$
For practice, modify mypbssubmit.m to rerun this example as a single compute node with five processor cores:
set(sched,'SubmitArguments','-l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+5');
View job status from qstat:
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115273.radon-a myusername standby Job1 -- 1 5 -- 00:01 Q --
The MATLAB client submitted this job as a single compute node (NDS) with five processor cores (TSK). The lab that runs the batch session and the four labs that run the parfor loop reside on the same compute node.
View job output:
SERIAL REGION: hostname:radon-a015.rcac.purdue.edu
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 8
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a015.rcac.purdue.edu 4 1 7
SERIAL REGION: hostname:radon-a015.rcac.purdue.edu
Elapsed time in parallel loop: 4.926231
Output shows that processor cores of one compute node (a015) processed the entire job.
Function submit() offers an alternate mode of operation. It can read your PBS configuration made previously with the Configuration Manager. Here is an alternate version of mypbssubmit.m which relies on mypbsconfig for the several details of job submission:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','Configuration','mypbsconfig');
pjob = createMatlabPoolJob(sched);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
To scale up this method to handle a real application, increase the wall time in mypbssubmit.m to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of DCS licenses purchased. Finally, increase the value of MATLAB_Distrib_Comp_Server to match the value of MinimumNumberOfWorkers.
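As a small illustration of keeping the license request in step with the pool size, the following shell sketch builds the resource string passed in SubmitArguments. The function name build_resources and its two parameters are hypothetical helpers, not part of PBS or MATLAB. The sketch assumes one compute node with one processor core per worker, as in the nodes=1:ppn=5 practice example above.

```shell
#!/bin/sh
# Hypothetical helper: build the -l resource string for SubmitArguments.
#   $1 = number of workers (MinimumNumberOfWorkers = MaximumNumberOfWorkers)
#   $2 = wall time as HH:MM:SS
# Assumption: one node, one processor core and one DCS license per worker.
build_resources() {
    workers=$1
    walltime=$2
    # %% prints the literal % that separates the two license requests
    printf 'nodes=1:ppn=%s,walltime=%s,gres=Parallel_Computing_Toolbox+1%%MATLAB_Distrib_Comp_Server+%s\n' \
        "$workers" "$walltime" "$workers"
}

build_resources 5 00:01:00
```

Called with 5 workers and a one-minute wall time, the helper reproduces the string used in the practice example; paste its output into the set(sched,'SubmitArguments',...) call.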
The fourth method of job submission uses the PBS qsub command to submit a pool job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the third method since it uses function submit() and the same MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file). This method differs from the third because the MATLAB client runs on a compute node rather than on the front end and the client uses the MATLAB 'local' scheduler rather than the MATLAB 'torque' scheduler.
Prepare a MATLAB script M-file which finds the MATLAB 'local' scheduler, which defines a MATLAB pool job with five workers (labs) and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mylclsubmit.m:
% FILENAME: mylclsubmit.m
!echo "mylclsubmit.m"
!hostname
sched = findResource('scheduler', 'type', 'local');
set(sched,'ClusterMatlabRoot',matlabroot);
pjob = createMatlabPoolJob(sched);
set(pjob,'MinimumNumberOfWorkers',5)
set(pjob,'MaximumNumberOfWorkers',5)
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

echo "myjob.sub"
hostname

module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY

# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclsubmit
Submit the job as a single compute node with six processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=6,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclsubmit.m; one processor core runs the two serial regions of the batch session; four processor cores run the iterations of the parallel for loop.
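The core count above can be sketched as simple arithmetic. The helper name ppn_for_local_pool_job is hypothetical; the sketch only assumes what the text states, namely that the MATLAB client needs one processor core beyond the workers named in MinimumNumberOfWorkers and MaximumNumberOfWorkers.

```shell
#!/bin/sh
# Hypothetical helper: processor cores to request for this method.
#   $1 = value of MinimumNumberOfWorkers/MaximumNumberOfWorkers
#        (this count already includes the lab that runs the batch session)
ppn_for_local_pool_job() {
    # one extra core for the MATLAB client (myjob.sub and mylclsubmit.m)
    echo $(( $1 + 1 ))
}

workers=5
ppn=$(ppn_for_local_pool_job "$workers")
# Print the qsub command rather than running it, so the sketch works anywhere.
echo "qsub -l nodes=1:ppn=${ppn},walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub"
```

With five workers the helper yields ppn=6, matching the qsub command shown above.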
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115280.radon-ad myusername standby myjob.sub 19225 1 6 -- 00:01 R 00:00
Job status shows six processor cores (TSK) on one compute node (NDS).
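If you want to check these columns from a script, one possible approach is to pick them out with awk. The sketch below works on the sample status line copied from the output above; the helper name qstat_fields is hypothetical, and the column positions are an assumption based on the header shown (NDS is column 6, TSK is column 7, S is column 10).

```shell
#!/bin/sh
# Sample status line copied from the qstat output above.
line='115280.radon-ad myusername standby myjob.sub 19225 1 6 -- 00:01 R 00:00'

# Hypothetical helper: print "NDS TSK S" for one qstat -u status line.
# Assumes whitespace-separated columns in the order of the header above.
qstat_fields() {
    echo "$1" | awk '{ print $6, $7, $10 }'
}

qstat_fields "$line"
```

For the sample line this prints the node count, the task count, and the job state, in that order.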
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a012.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclsubmit.m
radon-a012.rcac.purdue.edu
FINISHED SUBMITTING
ans =
SERIAL REGION: hostname:radon-a012.rcac.purdue.edu
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 8
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a012.rcac.purdue.edu 4 1 3
SERIAL REGION: hostname:radon-a012.rcac.purdue.edu
Elapsed time in parallel loop: 6.203370
Output shows that the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (5), a nonnegative scalar integer, determined the number of labs in the pool (4) which processed the parallel loop. While output does not explicitly show the fifth lab, that lab runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires N workers, counting the worker that runs the batch session, at least N processor cores must be available on the cluster.
The processor cores of one compute node (a012) processed the entire job. One processor core processed myjob.sub and mylclsubmit.m. One processor core processed the batch session myfunction.m, which includes the two serial regions, while four processor cores processed the parallel for loop. The parfor loop does not set variable numlabs to the number of labs in the pool, nor does it give each lab in the pool a unique value for variable labindex. The iterations appear in scrambled order because of the parallel nature of the parfor loop: labs process each iteration independently of the other iterations, so output from the iterations is in random order. You cannot assume that, because there are four processor cores and eight iterations, each processor core processes two iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. The processing time decreases as the number of processor cores increases, which is the evidence that several processor cores process the eight iterations in parallel.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value of MaximumNumberOfWorkers.
Specifying 13 workers to achieve a MATLAB pool with 12 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
set(pjob,'MinimumNumberOfWorkers',13);
set(pjob,'MaximumNumberOfWorkers',13);
$ qsub -l nodes=1:ppn=14,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using distcomp.simpleparalleljob/pSetMinimumNumberOfWorkers (line 59)
MinimumNumberOfWorkers must be the same as or less than MaximumNumberOfWorkers
for a job
Error in mylclsubmit (line 9)
set(pjob,'MinimumNumberOfWorkers',13);
The fifth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the first and third methods since it uses either a MATLAB script M-file or a MATLAB function M-file. Like the first and third methods, this method uses a user-defined PBS configuration. It differs from the other two methods because the script is slightly different. Two MATLAB statements specify a MATLAB pool since this method uses neither batch() nor submit() to make the MATLAB pool. Also, the MATLAB script M-file must quit at the end of the batch process. Another significant difference: This method uses one fewer DCS license than the first and third methods.
Modify the MATLAB script M-file myscript.m with matlabpool and quit statements or the MATLAB function M-file myfunction.m with matlabpool statements:
% FILENAME: myscript.m
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
matlabpool open 4;
numlabs = matlabpool('size');
fprintf(' hostname numlabs labindex iteration\n')
fprintf(' ------------------------------- ------- -------- ---------\n')
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL LOOP: %-31s %7d %8d %9d\n', name,numlabs,labindex,i)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
matlabpool close force;
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel loop: %f\n', elapsed_time)
quit;
% FILENAME: myfunction.m
function result = myfunction ()
% SERIAL REGION
% Variable "result" is a "reduction" variable.
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
matlabpool open 4;
numlabs = matlabpool('size');
r = sprintf(' hostname numlabs labindex iteration');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- -------- ---------');
result = strvcat(result,r);
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL LOOP: %-31s %7d %8d %9d', name,numlabs,labindex,i);
result = strvcat(result,r);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
matlabpool close force;
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('elapsed time: %f', elapsed_time);
result = strvcat(result,r);
end
Prepare a job submission file with an appropriate filename, here named myjob.sub. Run with the name of either the script M-file or the function M-file:
#!/bin/sh -l
# FILENAME: myjob.sub

echo "myjob.sub"
hostname

module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY

# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r myscript
# matlab -nodisplay -r myfunction
Run MATLAB to set the default parallel configuration to your PBS configuration:
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit;
$
Submit the job as a single compute node with one processor core and request one PCT license and four DCS licenses:
$ qsub -l nodes=1:ppn=1,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+4 myjob.sub
This job submission causes a second job submission.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
332026.radon-ad myusername standby myjob.sub 31850 1 1 -- 00:01 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
332026.radon-ad myusername standby myjob.sub 31850 1 1 -- 00:01 R 00:00
332028.radon-ad myusername standby Job1 668 4 4 -- 00:01 R 00:00
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows four processor cores (TSK) on four compute nodes (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a000.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a008.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a008.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a009.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a009.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a010.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a010.rcac.purdue.edu 4 1 8
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
Elapsed time in parallel region: 3.382151
Output shows the name of the compute node (a000) that processed the job submission file myjob.sub and the two serial regions. The job submission scattered the four processor cores (four MATLAB labs) that processed the iterations of the parallel loop among four different compute nodes (a007, a008, a009, a010). The parfor loop does not set variable numlabs to the number of labs in the pool, nor does it give each lab in the pool a unique value for variable labindex. The iterations appear in scrambled order because of the parallel nature of the parfor loop: labs process each iteration independently of the other iterations, so output from the iterations is in random order. You cannot assume that, because there are four processor cores and eight iterations, each processor core processes two iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. The processing time decreases as the number of processor cores increases, which is the evidence that several processor cores process the eight iterations in parallel.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses purchased.
The sixth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is like the fifth method since it uses the same MATLAB M-files and the same job submission file myjob.sub. This method differs from the fifth since it uses the MATLAB 'local' configuration rather than a PBS configuration.
Run MATLAB to set the default parallel configuration to the MATLAB 'local' configuration:
$ matlab -nodisplay
>> defaultParallelConfig('local');
>> quit;
$
Submit the job as a single compute node with five processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115287.radon-ad myusername standby myjob.sub 24010 1 5 -- 00:01 R 00:00
Job status shows five processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a007.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
SERIAL REGION: hostname:radon-a007.rcac.purdue.edu
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 7
PARALLEL LOOP: radon-a007.rcac.purdue.edu 4 1 8
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a007.rcac.purdue.edu
Elapsed time in parallel loop: 4.783794
Output shows that processor cores on one compute node (a007) processed the entire job. Output also shows that a "matlabpool using the 'local' configuration" is connected to four MATLAB labs. The parfor loop does not set variable numlabs to the number of labs in the pool, nor does it give each lab in the pool a unique value for variable labindex. The iterations appear in scrambled order because of the parallel nature of the parfor loop: labs process each iteration independently of the other iterations, so output from the iterations is in random order. You cannot assume that, because there are four processor cores and eight iterations, each processor core processes two iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. The processing time decreases as the number of processor cores increases, which is the evidence that several processor cores process the eight iterations in parallel.
Running this example with larger MATLAB pool sizes yields shorter runtimes:
| Pool Size | Time (seconds) |
|---|---|
| 1 | 17.2 |
| 2 | 9.0 |
| 4 | 4.8 |
| 8 | 3.8 |
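To make the trend in the table easier to read, the following sketch derives speedup and parallel efficiency from the measured times. The helper name speedup_table is hypothetical; the definitions used are the usual ones (speedup is the time at pool size 1 divided by the time at pool size p, and efficiency is speedup divided by p), and the input values are copied from the table above.

```shell
#!/bin/sh
# Hypothetical helper: read "pool_size time" pairs on stdin and print
# "pool_size time speedup efficiency", one row per pool size.
# The first input row is taken as the baseline (pool size 1).
speedup_table() {
    awk 'NR == 1 { t1 = $2 }
         { printf "%d %.1f %.2f %.2f\n", $1, $2, t1 / $2, t1 / $2 / $1 }'
}

# Pool sizes and times (seconds) from the table above.
printf '1 17.2\n2 9.0\n4 4.8\n8 3.8\n' | speedup_table
```

Since the example loop runs eight iterations that each pause for two seconds, the loop body alone accounts for about 16 seconds at pool size 1, so the gap between measured and ideal times reflects overhead outside the pauses.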
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value in matlabpool open.
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
matlabpool open 13;
$ qsub -l nodes=1:ppn=13,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myscript (line 6)
matlabpool open 13;
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
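A job submission file could guard against this error before MATLAB starts. The sketch below is one possible check, not part of the original example: the helper name check_pool_size and the variable LOCAL_MAX_WORKERS are hypothetical, and the limit of 12 is taken from the error message above for MATLAB R2011b.

```shell
#!/bin/sh
# Assumption: the 'local' scheduler of MATLAB R2011b allows at most 12 workers,
# as reported in the error message above.
LOCAL_MAX_WORKERS=12

# Hypothetical helper: succeed only if the requested pool fits the 'local' limit.
#   $1 = pool size intended for "matlabpool open"
check_pool_size() {
    if [ "$1" -gt "$LOCAL_MAX_WORKERS" ]; then
        echo "pool size $1 exceeds the local limit of $LOCAL_MAX_WORKERS" >&2
        return 1
    fi
    return 0
}

check_pool_size 12 && echo "12 workers: ok"
check_pool_size 13 || echo "13 workers: rejected"
```

Adjust LOCAL_MAX_WORKERS if you load a MATLAB version with a different 'local' limit.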
The seventh method of job submission uses the MATLAB Compiler mcc to compile a MATLAB function M-file with a PBS configuration and submits the compiled file to a PBS queue.
This method is similar to the third method since it uses a MATLAB function M-file. Like the third method, this method uses a user-defined PBS configuration. It differs from the third method because the script is slightly different. Two MATLAB statements specify a MATLAB pool since this method uses neither batch() nor submit() to make the MATLAB pool. Also, the MATLAB script M-file must quit at the end of the batch process. Another significant difference: This method uses one fewer DCS license than the third method.
Modify the MATLAB script M-file myscript.m with matlabpool and quit statements or the MATLAB function M-file myfunction.m with matlabpool statements. Proceed with the MATLAB function M-file myfunction.m (when compiling a parfor statement, the parfor must be in a function, not in a script; this is a bug in MATLAB):
% FILENAME: myscript.m
warning off all;
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
matlabpool open 4;
numlabs = matlabpool('size');
fprintf(' hostname numlabs labindex iteration\n')
fprintf(' ------------------------------- ------- -------- ---------\n')
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL LOOP: %-31s %7d %8d %9d\n', name,numlabs,labindex,i)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
matlabpool close force;
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel loop: %f\n', elapsed_time)
quit;
% FILENAME: myfunction.m
function result = myfunction ()
warning off all;
% SERIAL REGION
% Variable "result" is a "reduction" variable.
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
matlabpool open 4;
numlabs = matlabpool('size');
r = sprintf(' hostname numlabs labindex iteration');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- -------- ---------');
result = strvcat(result,r);
tic;
% PARALLEL LOOP
parfor i = 1:8
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL LOOP: %-31s %7d %8d %9d', name,numlabs,labindex,i);
result = strvcat(result,r);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel loop
matlabpool close force;
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('Elapsed time in parallel loop: %f', elapsed_time);
result = strvcat(result,r);
end
Prepare a wrapper script which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m:
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

echo "myjob.sub"
hostname

cd $PBS_O_WORKDIR
unset DISPLAY

./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the loaded versions. Set the default parallel configuration to your PBS configuration and quit MATLAB. Compile both the MATLAB script M-file mywrapper.m and the MATLAB function M-file myfunction.m:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit
$ mcc -m mywrapper.m myfunction.m
$ mkdir test
$ cp mywrapper test
$ cp run_mywrapper.sh test
$ cp myjob.sub test
$ cd test
To obtain the name of the compute node which runs this compiler-generated script run_mywrapper.sh, insert before the echo statement the Linux commands echo and hostname so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_mywrapper.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
  echo Usage:
  echo $0 \args
else
  echo Setting up environment variables
  MCRROOT="$1"
  echo ---
  LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
  MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
  XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
  export LD_LIBRARY_PATH;
  export XAPPLRESDIR;
  echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
  shift 1
  "${exe_dir}"/mywrapper $*
fi
exit
Submit the job as a single compute node with one processor core and request four DCS licenses:
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
This job runs myjob.sub on a compute node, and myjob.sub in turn submits the parallel job. The first job must run at least as long as the job with the parallel loop since it collects the results of the parallel job.
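The wall-time relationship can be checked with a few lines of shell. This is only a sketch: the helper name walltime_to_seconds is hypothetical, the outer value is the wall time from the qsub command above, and the inner value is assumed to be the one-minute wall time configured in the SubmitArguments of mypbsconfig.

```shell
#!/bin/sh
# Hypothetical helper: convert a PBS wall time HH:MM:SS to seconds.
walltime_to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

outer=00:05:00   # wall time of myjob.sub in the qsub command
inner=00:01:00   # wall time of the parallel job (assumed value in mypbsconfig)

if [ "$(walltime_to_seconds "$outer")" -ge "$(walltime_to_seconds "$inner")" ]; then
    echo "outer wall time covers the parallel job"
else
    echo "outer wall time is too short"
fi
```

The comparison covers run time only. The second job may also wait in the queue while the first job is already running, so it is prudent to leave extra margin in the outer wall time.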
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 -- 00:05 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 -- 00:05 R 00:00
115293.radon-ad myusername standby Job1 29390 4 4 -- 00:01 R 00:00
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows that this job submits a second job with four processor cores (TSK) on four compute nodes (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a021.rcac.purdue.edu
run_myfunction.sh
radon-a021.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
SERIAL REGION: hostname:radon-a021.rcac.purdue.edu
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a021.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a022.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a023.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a024.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a021.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a022.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a023.rcac.purdue.edu 4 1 8
PARALLEL LOOP: radon-a024.rcac.purdue.edu 4 1 7
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a021.rcac.purdue.edu
Elapsed time in parallel loop: 5.125206
Output shows the name of the compute node (a021) that ran the job submission file myjob.sub and the compiler-generated script run_mywrapper.sh, the name of the compute node (a021) that ran the two serial regions, and the names of the four compute nodes (a021, a022, a023, a024) that ran the four scattered processor cores (four MATLAB labs) that processed the iterations of the parallel loop. The parfor loop does not set variable numlabs to the number of labs in the pool, nor does it give each lab in the pool a unique value for variable labindex. The iterations appear in scrambled order because of the parallel nature of the parfor loop: labs process each iteration independently of the other iterations, so output from the iterations is in random order. You cannot assume that, because there are four processor cores and eight iterations, each processor core processes two iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. The processing time decreases as the number of processor cores increases, which is the evidence that several processor cores process the eight iterations in parallel.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses purchased. Increase the value of MATLAB_Distrib_Comp_Server in the qsub command to match the new size of the pool.
The eighth method of job submission uses the MATLAB Compiler mcc to compile a MATLAB function M-file with the 'local' configuration and submits the compiled file to a PBS queue.
This method is like the seventh method since it uses the same MATLAB M-files mywrapper.m and myfunction.m and the same job submission file myjob.sub. This method differs from the seventh since it uses the 'local' configuration (R2011a added support for compiling PCT code on the 'local' configuration).
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_myfunction.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the versions loaded. Set the default parallel configuration to the 'local' configuration and compile the MATLAB function M-file:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('local');
>> quit
$ mcc -m mywrapper.m myfunction.m
To obtain the name of the compute node which runs this compiler-generated script run_mywrapper.sh, insert the Linux commands echo and hostname before the echo statement that prints the dashed line, so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_mywrapper.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
  echo Usage:
  echo $0 \args
else
  echo Setting up environment variables
  MCRROOT="$1"
  echo ---
  LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
  MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
  LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
  XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
  export LD_LIBRARY_PATH;
  export XAPPLRESDIR;
  echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
  shift 1
  "${exe_dir}"/myfunction $*
fi
exit
Submit the job as a single compute node with four processor cores:
$ qsub -l nodes=1:ppn=4,walltime=00:05:00 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 4 -- 00:05 R 00:00
Job status shows four processor cores (TSK) on one compute node (NDS).
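The NDS and TSK columns can also be read by a script. The following is a minimal sketch, not part of the original procedure: the function name nds_tsk is hypothetical, and it assumes the eleven whitespace-separated columns shown above for a running job.

```shell
#!/bin/sh
# Hypothetical helper: print the NDS and TSK columns of qstat job lines.
# Assumes 11 whitespace-separated columns, as in the running-job line above.
# Header and separator lines are skipped because their sixth field is not
# a number.
nds_tsk() {
    awk 'NF == 11 && $6 ~ /^[0-9]+$/ { print $6, $7 }'
}

# Sample line copied from the qstat output above.
echo "115292.radon-ad myusername standby myjob.sub 28611 1 4 -- 00:05 R 00:00" | nds_tsk
```

For the sample line, the two values printed are the node count followed by the core count.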
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a000.rcac.purdue.edu
run_myfunction.sh
radon-a000.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa6
4:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/s
erver:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
hostname numlabs labindex iteration
------------------------------- ------- -------- ---------
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 2
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 4
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 5
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 6
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 1
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 3
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 8
PARALLEL LOOP: radon-a000.rcac.purdue.edu 4 1 7
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a000.rcac.purdue.edu
Elapsed time in parallel loop: 5.126201
Output shows that processor cores on one compute node (a000) processed the entire job. The parfor loop does not set variable numlabs to the number of labs in the pool; nor does it give each lab in the pool a unique value for variable labindex. The scrambled order of the iterations displayed in the output comes from the parallel nature of the parfor loop; labs process each iteration independently of the other iterations, so output from the iterations is in random order. You cannot assume that since there are four processor cores and eight iterations, each processor core processes two iterations of the parfor loop. While this example evenly distributed the iterations among the four labs, MATLAB may not use this schedule in other situations. Finally, output shows the time that the four labs spent running the eight iterations of the parfor loop. The evidence that several processor cores process the eight iterations of the parfor loop in parallel is that the processing time decreases as the number of processor cores increases.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must equal the value in matlabpool open.
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
matlabpool open 13;
qsub -l nodes=1:ppn=13,walltime=00:01:00 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myfunction (line 10)
Error in mywrapper (line 3)
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
}
distcomp:matlabpool:RunValidation
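A pool size can be checked against the 'local' limit before the job is submitted. The following is a minimal sketch under stated assumptions: the function name check_local_pool and the variable LOCAL_WORKER_LIMIT are hypothetical, the limit of 12 is the R2011a figure quoted in this section, and the function only prints the qsub command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: refuse pool sizes that exceed the 'local' worker limit.
# LOCAL_WORKER_LIMIT=12 is the R2011a limit quoted in this guide;
# use 8 for R2009a.
LOCAL_WORKER_LIMIT=12

check_local_pool() {
    pool_size=$1
    if [ "$pool_size" -gt "$LOCAL_WORKER_LIMIT" ]; then
        echo "pool size $pool_size exceeds the local limit of $LOCAL_WORKER_LIMIT workers" >&2
        return 1
    fi
    # For the compiled 'local' method, ppn equals the pool size.
    echo "qsub -l nodes=1:ppn=$pool_size,walltime=00:05:00 myjob.sub"
}

check_local_pool 4
check_local_pool 13 || echo "not submitted"
```

The first call prints a qsub command that matches the one used in this method; the second call reports the limit on standard error and prints "not submitted".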
For more information about MATLAB Parallel Computing Toolbox:
The MATLAB Parallel Computing Toolbox (PCT) extends the MATLAB language with high-level, parallel-processing features such as parallel for loops, parallel regions, message passing, distributed arrays, and parallel numerical methods. PCT enables task and data parallelism on a multicore processor. It offers a shared-memory computing environment with a maximum of eight MATLAB workers (labs, threads; version R2009a) and 12 workers (labs, threads; version R2011a) running on the local configuration in addition to your MATLAB client. Moreover, the MATLAB Distributed Computing Server (DCS) scales PCT applications up to the limit of your DCS licenses. This section illustrates the coarse-grained parallelism of a parallel region (spmd) in a pool job. Areas of application include SPMD (single program, multiple data) problems.
This section illustrates eight methods of submitting a small, parallel, MATLAB program with a parallel region (spmd statement) as a batch, MATLAB pool job to a PBS queue. The MATLAB program prints the name of the run host and shows the values of variables numlabs and labindex for each parallel region of the pool. The system function hostname returns two values: a numerical code and the name of the compute node that runs the parallel region.
The first method runs on a front end a MATLAB client which calls the MATLAB batch() function with a user-defined PBS configuration. Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session (including the two serial regions in the MATLAB M-code) while the other four run the spmd statement. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running function batch() and quitting MATLAB. The five DCS licenses remain active between running function batch() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license().
The second method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB batch() function with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The third method runs on a front end a MATLAB client which calls the MATLAB submit() function with the 'torque' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'torque' scheduler, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session (including the two serial regions in the MATLAB M-code) while the other four run the spmd statement. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. The five DCS licenses remain active between running function submit() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license().
The fourth method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB submit() function with the 'local' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'local' scheduler, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The fifth method uses the PBS qsub command to submit to compute nodes a MATLAB client which interprets an M-file with a user-defined PBS configuration which scatters the MATLAB workers onto different compute nodes. Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out six licenses: one MATLAB license for the client running on the compute node, one PCT license, and four DCS licenses. Four DCS licenses run the four copies of the spmd statement. This job is completely off the front end.
The sixth method uses the PBS qsub command to submit to a compute node a MATLAB client which interprets an M-file with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on a compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. This job is completely off the front end.
The seventh method uses the MATLAB Compiler mcc and the default parallel configuration set to a PBS configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses a PBS configuration during compilation, this method, when executed, uses the MATLAB Distributed Computing Server; so, it requires and checks out four DCS licenses during the execution of the spmd statement. The serial portions of this job do not use a DCS license. This job is completely off the front end.
The eighth method uses the MATLAB Compiler mcc and the default parallel configuration set to the 'local' configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses the 'local' configuration, this method, when executed, uses no license. (Support for running compiled PCT code on the 'local' configuration was added in R2011a; this feature removes the need for DCS licenses in some cases.) This job is completely off the front end.
The MATLAB Compiler license is a lingering license. Using the compiler locks its license for at least 30 minutes. Each time you issue the compiler command, you reset the 30-minute timer. When running mcc at the MATLAB prompt, you hold the license as long as MATLAB remains open. To release the license, quit MATLAB. Quitting MATLAB is necessary to release the license, but not sufficient. When you quit MATLAB before the 30-minute timer runs out, then the license remains checked out for the remaining time. When running mcc at the Linux prompt, the 30-minute timer starts or restarts. The license manager tracks MATLAB licenses per user per cluster. If you run the MATLAB Compiler on two different clusters, you use two Compiler licenses. To minimize your license usage, run the Compiler on one cluster.
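The timer rules above can be restated as a small calculation. The following is a hypothetical model only, written from the description in this section and not from the license manager itself; the function name compiler_license_release and the use of minutes counted from an arbitrary starting point are assumptions.

```shell
#!/bin/sh
# Hypothetical model of the lingering Compiler license described above.
# Times are minutes counted from an arbitrary starting point.
LINGER_MINUTES=30

# Usage: compiler_license_release LAST_MCC_TIME [MATLAB_QUIT_TIME]
# Prints the earliest time at which the license is released.
compiler_license_release() {
    last_mcc=$1
    quit_time=${2:-$last_mcc}   # mcc at the Linux prompt: no MATLAB session
    timer_end=$((last_mcc + LINGER_MINUTES))
    if [ "$quit_time" -gt "$timer_end" ]; then
        echo "$quit_time"       # MATLAB stayed open past the timer
    else
        echo "$timer_end"       # the timer outlasts the MATLAB session
    fi
}

compiler_license_release 0        # mcc at the Linux prompt at minute 0
compiler_license_release 10 15    # mcc at minute 10, MATLAB quit at minute 15
compiler_license_release 10 90    # mcc at minute 10, MATLAB quit at minute 90
```

The three calls print 30, 40, and 90: the license is released at the later of the timer expiry and the moment MATLAB quits.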
You can share your compiled program with colleagues who do not have MATLAB licenses. Your compiled program carries its required licenses within itself, so you do not need to install any license file on your colleague's machine.
The following table summarizes MATLAB license usage:
| Method | Description | MATLAB | PCT | DCS | mcc | Limitations |
|---|---|---|---|---|---|---|
| 1 | batch() with user-defined PBS configuration | 1 | 1 | Matlabpool + 1 | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 2 | batch() with 'local' configuration, qsub | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 3 | submit() with 'torque' scheduler | 1 | 1 | MaximumNumberOfWorkers | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 4 | submit() with 'local' scheduler, qsub | 1 | 1 | 0 | 0 | local scheduler with 8 (R2009a) and 12 (R2011a) workers |
| 5 | qsub with user-defined PBS configuration | 1 | 1 | pool size | 0 | number of MATLAB, PCT, DCS licenses purchased |
| 6 | qsub with 'local' configuration | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 7 | Compiler with user-defined PBS configuration, qsub | 0 | 0 | pool size | 1 | number of DCS licenses purchased |
| 8 | Compiler with 'local' configuration, qsub | 0 | 0 | 0 | 1 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
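The table can be restated as a lookup for a given number of labs. The following is a minimal sketch: the function name licenses_for_method is hypothetical, and for the third method it assumes that the argument is the number of labs that run the spmd statement, so that MaximumNumberOfWorkers is one more than that number.

```shell
#!/bin/sh
# Hypothetical helper that restates the license table for a pool of N labs.
# Prints four counts in the order: MATLAB PCT DCS mcc.
licenses_for_method() {
    method=$1
    labs=$2
    case $method in
        1|3)   echo "1 1 $((labs + 1)) 0" ;;  # pool plus the batch session
        2|4|6) echo "1 1 0 0" ;;              # 'local': no DCS licenses
        5)     echo "1 1 $labs 0" ;;          # PBS configuration, qsub
        7)     echo "0 0 $labs 1" ;;          # compiled, PBS configuration
        8)     echo "0 0 0 1" ;;              # compiled, 'local' configuration
        *)     echo "unknown method: $method" >&2; return 1 ;;
    esac
}

# Show the counts for the four-lab pool used throughout this section.
for m in 1 2 3 4 5 6 7 8; do
    echo "method $m: $(licenses_for_method "$m" 4)"
done
```

The mcc column counts the Compiler license used at compile time, not while the compiled program runs.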
Prepare a MATLAB pool program in the form of a MATLAB script M-file and a MATLAB function M-file with appropriate filenames, here named myscript.m and myfunction.m:
% FILENAME: myscript.m
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf(' hostname numlabs labindex\n')
fprintf(' ------------------------------- ------- --------\n')
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL REGION: %-31s %7d %8d\n', name,numlabs,labindex)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel region: %f\n', elapsed_time)
% FILENAME: myfunction.m
function result = myfunction ()
% SERIAL REGION
% Variable "r" is a "composite object."
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
r = sprintf(' hostname numlabs labindex');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- --------');
result = strvcat(result,r);
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL REGION: %-31s %7d %8d', name,numlabs,labindex);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
for ndx=1:length(r) % concatenate composite object "r"
result = strvcat(result,r{ndx});
end
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('Elapsed time in parallel region: %f', elapsed_time);
result = strvcat(result,r);
end
Both M-files display the names of all compute nodes which run the job and the associated lab IDs. The script M-file uses fprintf() to display the results. The function M-file returns a single value which contains a concatenation of the results.
The execution of a pool job starts with a worker (batch session) executing the statements of the first serial region up to the spmd block, when it pauses. A set of workers (the pool) executes the spmd block. When they finish, the batch session resumes by executing the second serial region. The code displays the names of the compute nodes running the batch session and the worker pool.
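The same flow of a serial region, a parallel region, and a second serial region can be imitated in the shell. The following is only an analogy, not MATLAB and not how the Parallel Computing Toolbox is implemented: background subshells stand in for the labs, wait stands in for the pause of the batch session, and the function name run_pool_job is hypothetical.

```shell
#!/bin/sh
# Shell analogue (not MATLAB) of the pool-job flow: a serial region,
# a parallel region run by N background processes, then a serial region.
run_pool_job() {
    numlabs=$1
    echo "SERIAL REGION: hostname:$(uname -n)"
    labindex=1
    while [ "$labindex" -le "$numlabs" ]; do
        # Each background subshell stands in for one lab of the pool.
        ( echo "PARALLEL REGION: $(uname -n) $numlabs $labindex" ) &
        labindex=$((labindex + 1))
    done
    wait    # the "batch session" pauses until every lab has finished
    echo "SERIAL REGION: hostname:$(uname -n)"
}

run_pool_job 4
```

As with the labs of a real pool, the order of the lines from the parallel region is not guaranteed.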
The first method of job submission uses function batch() to offload from the MATLAB client running on the front end to a batch session running on a worker (compute node) the responsibilities of supervising another set of workers called the pool and accumulating the results. The batch session and the pool cooperate on processing a single program. Each worker in the pool has a unique identifier and can determine its behavior from that ID. The workers of the pool process simultaneously their respective portions of the workload of the parallel region so that the parallel region might run faster than the equivalent serial version. A pool size of N requires N+1 workers (processor cores). The source code is a MATLAB M-file (MATLAB function batch() accepts either a script M-file or a function M-file).
On the front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, view the current default parallel configuration; most likely, it is the 'local' configuration. Call MATLAB function batch() to make a four-lab pool on which to run the MATLAB code in the file myscript.m. In the call, replace the 'local' configuration by specifying your PBS configuration which you previously designed with the Configuration Manager (using the 'local' configuration at this point will run the spmd statement on the front end). This particular PBS configuration scatters the labs to different compute nodes to verify that a four-lab spmd statement actually uses five processor cores. Also, capture job output in the diary. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, view results by viewing the diary. When you do not want to keep the files of your job, use MATLAB function destroy() to erase them from your current working directory. After this job finishes, the default parallel configuration remains the 'local' configuration.
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> disp(defaultParallelConfig);
local
>> pjob=batch('myscript','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> license('inuse')
distrib_computing_toolbox
matlab
>> disp(pjob.get('State'))
finished
>> license('inuse')
distrib_computing_toolbox
matlab
>> pjob.diary;
>> ls -l
>> pjob.destroy;
>> ls -l
>> disp(defaultParallelConfig);
local
>> quit;
$
The license name distrib_computing_toolbox refers to the Parallel Computing Toolbox.
View job status from qstat:
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115204.radon-a myusername standby Job1 5 5 -- 00:01 Q --
Job status shows that the MATLAB client submitted this job as five compute nodes (NDS), each with one processor core (TSK), and with a requested wall time of one minute. The call to function batch() specifies four labs to evaluate the parallel regions (spmd statement). The fifth lab runs the batch session, myscript.m, and accumulates the results. This arrangement explains the presence of five DCS licenses.
View job output from the diary:
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
hostname numlabs labindex
------------------------------- ------- --------
Lab 2:
PARALLEL REGION: radon-a637.rcac.purdue.edu 4 2
Lab 3:
PARALLEL REGION: radon-a636.rcac.purdue.edu 4 3
Lab 4:
PARALLEL REGION: radon-a635.rcac.purdue.edu 4 4
Lab 1:
PARALLEL REGION: radon-a638.rcac.purdue.edu 4 1
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 3.204323
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs in the pool which processed the parallel region. Also, output shows the fifth lab which runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires the lab running the batch session in addition to N labs in the pool, there must be at least N+1 processor cores available on the cluster.
The MATLAB client scattered the five processor cores (five MATLAB labs) among five different compute nodes. A processor core of one compute node (a639) processed the batch session, including the two serial regions, while four processor cores of four different compute nodes (a638,a637,a636,a635) processed the parallel regions. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. There are four parallel regions, so there are four lab IDs. Output shows the four labs in scrambled order since the labs process each parallel region independently of the other parallel regions. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds.
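The timing argument is a simple comparison of the measured time with the serial lower bound. The following is a minimal sketch of that comparison; the function name ran_in_parallel is hypothetical, and the values are those of the example above (four labs, a two-second pause in each, and a measured time of 3.204323 seconds).

```shell
#!/bin/sh
# Compare the measured time of the parallel region with the serial lower bound.
# An elapsed time below labs * pause means the pauses must have overlapped.
ran_in_parallel() {
    labs=$1
    pause_seconds=$2
    elapsed=$3
    awk -v n="$labs" -v p="$pause_seconds" -v t="$elapsed" 'BEGIN {
        serial_bound = n * p    # time if the labs had run one after another
        if (t + 0 < serial_bound) print "parallel"; else print "inconclusive"
        exit
    }'
}

ran_in_parallel 4 2 3.204323
```

A time at or above the bound does not prove serial execution, which is why the sketch reports "inconclusive" in that case.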
After calling the MATLAB function batch(), you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with the PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy().
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','Configuration','mypbsconfig');
>> pjob=findJob(sched,'State','finished');
>> pjob.diary;
>> pjob.destroy;
>> quit;
$
To apply the first method of job submission to a function M-file, use one of the following sequences:
>> pjob=batch('myfunction','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> disp(pjob.get('State'))
finished
>> pjob.diary
>> pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
Function batch() offers an alternate mode of operation. If you exclude the property Configuration and its value from the list of arguments of function batch(), then batch() reads the default parallel configuration. Before calling batch(), set the default parallel configuration with your PBS configuration which you previously designed with the Configuration Manager. This change of the default parallel configuration is immediate and permanent; the PBS configuration remains the default parallel configuration during subsequent runs of MATLAB. MATLAB function batch() reads the default parallel configuration to discover your instructions about submission.
To scale up this method to handle a real application, use the Configuration Manager in the Parallel menu to increase the wall time to accommodate a longer running job. Enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of DCS licenses purchased.
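The sizing rules for this method reduce to two small calculations. The following is a minimal sketch; the function names are hypothetical, and the license count of 32 is an illustrative number, not the number of licenses actually purchased.

```shell
#!/bin/sh
# Hypothetical helpers for sizing a batch() pool job with a PBS configuration.

# A pool of N labs needs N+1 workers: the pool plus the batch session.
workers_for_pool() {
    echo $(($1 + 1))
}

# The largest pool is one less than the number of DCS licenses purchased.
max_pool_for_licenses() {
    echo $(($1 - 1))
}

workers_for_pool 4          # the four-lab pool of this example
max_pool_for_licenses 32    # 32 is an illustrative license count
```

For the four-lab pool of this example, the first call gives the five workers seen in the qstat output.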
The second method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the first method since it uses function batch() and the same MATLAB M-file, either myscript.m or myfunction.m. This method differs from the first because the MATLAB client runs on a compute node rather than on the front end and the client uses the MATLAB 'local' configuration rather than a user-defined PBS configuration.
Prepare a MATLAB script M-file that calls MATLAB function batch() which makes a four-lab pool on which to run the MATLAB code in the file myscript.m, which specifies the 'local' configuration and which captures job output in the diary. Use an appropriate filename, here named mylclbatch.m:
% FILENAME: mylclbatch.m
!echo "mylclbatch.m"
!hostname
pjob=batch('myscript','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclbatch
Submit the job as a single compute node with six processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=6,walltime=0:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclbatch.m; one processor core runs the two serial regions of the MATLAB M-file; four processor cores run the four copies of the parallel region.
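The core count in the qsub command follows from the pool size. The following is a minimal sketch that builds the command for a given pool; the function name local_batch_qsub is hypothetical, and the function only prints the command rather than submitting it.

```shell
#!/bin/sh
# Hypothetical helper: build the qsub command for batch() with the 'local'
# configuration. Cores = 1 (job script and client) + 1 (batch session) + pool.
local_batch_qsub() {
    pool_size=$1
    walltime=$2
    ppn=$((pool_size + 2))
    echo "qsub -l nodes=1:ppn=$ppn,walltime=$walltime,gres=Parallel_Computing_Toolbox+1 myjob.sub"
}

local_batch_qsub 4 0:01:00
```

For a four-lab pool and one minute of wall time, the printed command is the same as the qsub command above.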
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
99025.radon-ad myusername standby myjob.sub 30197 1 6 -- 00:01 R 00:00
Job status shows six processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclbatch.m
radon-a639.rcac.purdue.edu
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
hostname numlabs labindex
------------------------------- ------- --------
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 2
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 3
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 4
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 3.406318
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs (4) in the pool which processed the parallel region. While output does not explicitly show the fifth lab, that lab runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires the lab running the batch session in addition to N labs in the pool, there must be at least N+1 processor cores available on the cluster.
Output shows that processor cores on one compute node (a639) processed the entire job. One processor core processed myjob.sub and mylclbatch.m. One processor core processed the batch session myscript.m, which includes the two serial regions, while four processor cores processed the parallel regions. Variable numlabs shows the total number of labs (4). Variable labindex shows the ID of an individual lab. There are four labs, so there are four lab IDs. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply the second method of job submission to a function M-file, modify mylclbatch.m with one of the following sequences:
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be two greater than the value of Matlabpool. Specifying a MATLAB pool with 12 labs means a total of 13 workers. This exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow: The third method of job submission uses function submit() to offload from the MATLAB client running on the front end to a batch session running on a worker (compute node) the responsibilities of supervising another set of workers called the pool and accumulating the results. The batch session and the pool cooperate on processing a single program. Each worker in the pool has a unique identifier and can determine its behavior from that ID. The workers of the pool process simultaneously their respective portions of the workload of the parallel region so that the parallel region might run faster than the equivalent serial version. A pool size of N requires N+1 workers (processor cores). The source code is a MATLAB function M-file (MATLAB function submit() accepts only a function M-file). Prepare a MATLAB script M-file which finds the MATLAB 'torque' scheduler, which specifies scattering five processor cores to five different compute nodes and one minute of wall time for this short job and the number of PCT and DCS licenses, which defines a MATLAB pool job with five workers (labs) and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m: On a front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. 
At the MATLAB prompt, run mypbssubmit. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, get all output arguments into a cell array and display the results. When you do not want to keep the files of a job, use MATLAB function destroy() to erase them from your current working directory. View job status from qstat: Job status shows that the MATLAB client submitted this job as five compute nodes (NDS), each with one processor core (TSK), and the requested wall time of one minute. The call to function submit() specifies five labs as the minimum and maximum number of labs of the MATLAB pool. Four labs evaluate the four parallel regions. The fifth lab runs the batch session myfunction.m, including the two serial regions, and accumulates the results. This arrangement explains the presence of five DCS licenses. View job output: Output shows that the nonnegative scalar integer which is the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (5) defined the number of labs in the pool (4) which processed the parallel region. Also, output shows the fifth lab which runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires N labs including the lab running the batch session, there must be at least N processor cores available on the cluster. The MATLAB client scattered the five processor cores (five MATLAB labs) among five different compute nodes. A processor core of one compute node (a636) processed the batch session, including the two serial regions, while four processor cores of four different compute nodes (a634,a633,a632,a631) processed the parallel regions. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. 
There are four parallel regions, so there are four lab IDs. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds. After the "FINISHED SUBMITTING" message, you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy(). For practice, modify mypbssubmit.m to rerun this example as a single compute node with five processor cores: View job status from qstat: The MATLAB client submitted this job as a single compute node (NDS) with five processor cores (TSK). The lab that runs the batch session and the four labs that run the spmd statement reside on the same compute node. View job output: Output shows that processor cores of one compute node (a639) processed the entire job. Function submit() offers an alternate mode of operation. It can read your PBS configuration made previously with the Configuration Manager. Here is an alternate version of mypbssubmit.m which relies on mypbsconfig for the several details of job submission: To scale up this method to handle a real application, increase the wall time in mypbssubmit.m to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of DCS licenses purchased. Finally, increase the value of MATLAB_Distrib_Comp_Server to match the value of MinimumNumberOfWorkers. 
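The worker bookkeeping for the third method can be checked with a little shell arithmetic. What follows is only a minimal sketch and is not one of the original example files; the function name pool_job_plan is hypothetical. It assumes, as described above, that MinimumNumberOfWorkers and MaximumNumberOfWorkers share one value N, that one of those N workers runs the batch session, and that each worker takes one DCS license.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): given the common value N
# of MinimumNumberOfWorkers and MaximumNumberOfWorkers, report how the workers
# of a MATLAB pool job submitted with submit() are used.
pool_job_plan() {
    n=$1
    # Assumption: a pool job needs one batch-session worker plus at least one lab.
    if [ "$n" -lt 2 ]; then
        echo "error: need at least 2 workers (1 batch session + 1 lab)" >&2
        return 1
    fi
    spmd_labs=$((n - 1))   # workers left over for the parallel regions
    echo "workers=$n batch_session=1 spmd_labs=$spmd_labs dcs_licenses=$n"
}

pool_job_plan 5
```

For the example above (N = 5) this prints workers=5 batch_session=1 spmd_labs=4 dcs_licenses=5, which matches the five DCS licenses and four lab IDs seen in the job output.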
The fourth method of job submission uses the PBS qsub command to submit a pool job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node. This method is similar to the third method since it uses function submit() and the same MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file). This method differs from the third because the MATLAB client runs on a compute node rather than on the front end and the client uses the MATLAB 'local' scheduler rather than the MATLAB 'torque' scheduler.
Prepare a MATLAB script M-file which finds the MATLAB 'local' scheduler, which defines a MATLAB pool job with five workers (labs) and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mylclsubmit.m:
Prepare a job submission file with an appropriate filename, here named myjob.sub:
Submit the job as a single compute node with six processor cores and request one PCT license:
One processor core runs myjob.sub and mylclsubmit.m; one processor core runs the two serial regions of the batch session; four processor cores run the four copies of the parallel region.
View job status:
Job status shows six processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Output shows that the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (5) determined the number of labs in the pool (4) that processed the parallel region. While output does not explicitly show the fifth lab, that lab runs the batch session, which includes the two serial portions of the MATLAB pool program. Because the MATLAB pool requires N workers including the worker running the batch session, there must be at least N processor cores available on the cluster.
The processor cores of one compute node (a639) processed the entire job. One processor core processed myjob.sub and mylclsubmit.m. One processor core processed the batch session myfunction.m, which includes the two serial regions, while four processor cores processed the parallel regions. Variable numlabs shows the total number of labs, which in this case is four. Variable labindex shows the ID of an individual lab. There are four labs, so there are four lab IDs. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds. Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value of MaximumNumberOfWorkers. Specifying 13 workers to achieve a MATLAB pool with 12 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
The fifth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node. This method is similar to the first and third methods since it uses either a MATLAB script M-file or a MATLAB function M-file. Like the first and third methods, this method uses a user-defined PBS configuration.
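The six-core request for the fourth method follows from simple addition, which can be sketched in shell. This is an illustration only and not one of the original example files; the function name local_submit_ppn is hypothetical, and the resource string is printed rather than passed to qsub. It assumes that the MATLAB client needs one core of its own in addition to the workers counted by MaximumNumberOfWorkers.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): cores needed on one
# compute node for the fourth method, where the MATLAB client itself runs
# on the compute node alongside the pool workers.
local_submit_ppn() {
    max_workers=$1     # value of MaximumNumberOfWorkers (batch session + labs)
    client=1           # core that runs myjob.sub and mylclsubmit.m
    echo $((max_workers + client))
}

ppn=$(local_submit_ppn 5)
# TORQUE-style resource request; only printed here, never submitted.
echo "nodes=1:ppn=$ppn"
```

With MaximumNumberOfWorkers set to 5, the sketch prints nodes=1:ppn=6: one core for the client, one for the batch session, and four for the parallel regions.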
It differs from the other two methods because the script is slightly different. Two MATLAB statements specify a MATLAB pool since this method uses neither batch() nor submit() to make the MATLAB pool. Also, the MATLAB script M-file must quit at the end of the batch process. Another significant difference: This method uses one fewer DCS license than the first and third methods. Modify the MATLAB script M-file myscript.m with matlabpool and quit statements or the MATLAB function M-file myfunction.m with matlabpool statements: Prepare a job submission file with an appropriate filename, here named myjob.sub. Run with the name of either the script M-file or the function M-file: Run MATLAB to set the default parallel configuration to your PBS configuration: Submit the job as a single compute node with one processor core and request one PCT license and four DCS licenses: This job submission causes a second job submission. View job status: At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows four processor cores (TSK) on four compute nodes (NDS). View results in the file for all standard output, myjob.sub.omyjobid: Output shows the name of one compute node (a639) that processed the job submission file myjob.sub and the two serial regions. The job submission scattered four processor cores (four MATLAB labs) among four different compute nodes (a639,a638,a637,a636) that processed the four parallel regions. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. There are four parallel regions, so there are four lab IDs. Output shows the four labs in scrambled order since the labs process each parallel region independently of the other parallel regions. Finally, output shows the time that the four labs spent running the four parallel regions. 
The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds. Any output written to standard error will appear in myjob.sub.emyjobid. To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses purchased. The sixth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node. This method is like the fifth method since it uses the same MATLAB M-files and the same job submission file myjob.sub. This method differs from the fifth since it uses the MATLAB 'local' configuration rather than a PBS configuration. Run MATLAB to set the default parallel configuration to the MATLAB 'local' configuration: Submit the job as a single compute node with five processor cores and request one PCT license: View job status: Job status shows five processor cores (TSK) on one compute node (NDS). View results in the file for all standard output, myjob.sub.omyjobid: Output shows that processor cores on one compute node (a639) processed the entire job. Output also shows that a "matlabpool using the 'local' configuration" is connected to four MATLAB labs. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of labs in the pool and assigns to each lab in the pool a unique value for variable labindex. 
Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each lab pauses for two seconds. Any output written to standard error will appear in myjob.sub.emyjobid. To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value in matlabpool open. Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow: The seventh method of job submission uses the MATLAB Compiler mcc to compile a MATLAB M-file with a PBS configuration and submits the compiled file to a PBS queue. This method is similar to the first and third methods since it uses either a MATLAB script M-file or a MATLAB function M-file. Like the first and third methods, this method uses a user-defined PBS configuration. It differs from the other two methods because the script is slightly different. Two MATLAB statements specify a MATLAB pool since this method uses neither batch() nor submit() to make the MATLAB pool. Also, the MATLAB script M-file must quit at the end of the batch process. Another significant difference: This method uses one fewer DCS license than the first and third methods. Modify the MATLAB script M-file myscript.m with matlabpool and quit statements or the MATLAB function M-file myfunction.m with matlabpool statements: Prepare a job submission file with an appropriate filename, here named myjob.sub: On a front end, load modules for MATLAB and GCC. 
The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the loaded versions. Set the default parallel configuration to your PBS configuration and quit MATLAB. Compile the MATLAB script M-file myscript.m:
To obtain the name of the compute node which runs this compiler-generated script run_myscript.sh, insert before the echo statement the Linux commands echo and hostname so that the script appears as follows:
Submit the job as a single compute node with one processor core and request four DCS licenses:
This job runs myjob.sub on a compute node, and myjob.sub in turn submits the parallel job. The first job must run at least as long as the job with the parallel region since it collects the results of the parallel job.
View job status:
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows that this job submits a second job with four processor cores (TSK) on four compute nodes (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Output shows the name of the compute node (a639) that ran the job submission file myjob.sub and the compiler-generated script run_myscript.sh, the name of the compute node (a639) that ran the two serial regions, and the names of the four compute nodes (a639,a638,a636,a633) that ran the four scattered processor cores (four MATLAB labs) that processed the four parallel regions. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. There are four parallel regions, so there are four lab IDs. Output shows the four labs in scrambled order since the labs process each parallel region independently of the other parallel regions. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel.
Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds. Any output written to standard error will appear in myjob.sub.emyjobid. To apply this method of job submission to a MATLAB function M-file, prepare a wrapper script which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m: Prepare a job submission file with an appropriate filename, here named myjob.sub: Compile both the wrapper and the function then submit: To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running parallel loop. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses acquired. Increase the value of MATLAB_Distrib_Comp_Server in the qsub command to match the new size of the pool. The eighth method of job submission uses the MATLAB Compiler mcc to compile a MATLAB M-file with the 'local' configuration and submits the compiled file to a PBS queue. This method is like the seventh method since it uses the same MATLAB M-files and the same job submission file myjob.sub. This method differs from the seventh since it uses the MATLAB 'local' configuration (R2011a added support for compiling PCT code on the 'local' configuration). On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the versions loaded. 
Set the default parallel configuration to the 'local' configuration and compile the MATLAB script M-file: To obtain the name of the compute node which runs this compiler-generated script run_myscript.sh, insert before the echo statement the Linux commands echo and hostname so that the script appears as follows: Submit the job as a single compute node with four processor cores: View job status: Job status shows four processor cores (TSK) on one compute node (NDS). View results in the file for all standard output, myjob.sub.omyjobid: Output shows that processor cores on one compute node (a299) processed the entire job. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. There are four parallel regions, so there are four lab IDs. Output shows the four labs in sequential order even though the labs process each parallel region independently of the other parallel regions. Finally, output shows the time that the four labs spent running the four parallel regions. The time implies that the four labs run in parallel. Had they run in serial, then the time would have been at least eight seconds since each of the four spmd statements pauses for two seconds. Any output written to standard error will appear in myjob.sub.emyjobid. To apply the eighth method of job submission to a MATLAB function M-file, prepare a wrapper script which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m: Prepare a job submission file with an appropriate filename, here named myjob.sub: Compile both the wrapper and the function then submit: To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. 
The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must equal the value in the matlabpool open statement. Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
For more information about MATLAB Parallel Computing Toolbox:
The MATLAB Parallel Computing Toolbox (PCT) extends the MATLAB language with high-level, parallel-processing features such as parallel for loops, parallel regions, message passing, distributed arrays, and parallel numerical methods. PCT enables task and data parallelism on a multicore processor. It offers a shared-memory computing environment with a maximum of eight MATLAB workers (labs, threads; version R2009a) and 12 workers (labs, threads; version R2011a) running on the local configuration in addition to your MATLAB client. Moreover, the MATLAB Distributed Computing Server (DCS) scales PCT applications up to the limit of your DCS licenses.
This section illustrates the coarse-grained parallelism of several independent tasks in a distributed job. The tasks of a distributed job may be identical or similar, but can be completely different from one another. The tasks do not communicate with each other. They need not run simultaneously. A multi-core compute node might run one task or several tasks in parallel and/or in succession. Areas of application include embarrassingly parallel computations, such as parameter sweeps.
This section illustrates two methods of submitting a small MATLAB distributed job with several identical but independent tasks to a PBS queue. The tasks display the names of the compute nodes running the tasks. The system function hostname returns two values: a numerical code and the name of the compute node that runs the command.
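The two values mentioned above, a numerical code and the node name, have a direct shell analogue: the exit status of the command and the text it prints. The following is a rough shell illustration, not MATLAB code and not one of the original example files; the function name run_and_report is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the original guide): run a command and report
# two pieces of information, its exit status and the text it wrote to
# standard output.
run_and_report() {
    output=$("$@" 2>/dev/null)   # capture standard output of the command
    status=$?                    # exit status of the command just captured
    echo "status=$status output=$output"
}

# On a compute node this would report status 0 and the name of that node.
run_and_report hostname
```

A status of 0 conventionally indicates success; the output field then holds the name of the node that ran the command.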
Also, there is an explanation that illustrates what happens when the number of tasks exceeds the number of DCS licenses. The first method runs on a front end a MATLAB client which calls the MATLAB submit() function with the 'torque' scheduler. When you need more granularity in your submission, use submit(). Since it uses the 'torque' scheduler, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out 21 licenses: one MATLAB license for the client running on the front end, one PCT license, and 19 instances of the DCS licenses. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. The DCS licenses remain active between running function submit() and job completion. The DCS licenses do not appear in the output of function license(). The second method uses the PBS qsub command to submit a job to a PBS queue. This method runs on a compute node a MATLAB client which uses the 'local' configuration to run all tasks. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. This job is completely off the front end. The following table summarizes MATLAB license usage: For the first method of job submission, prepare a MATLAB script M-file which finds the 'torque' scheduler, defines a distributed job and 11 tasks, and calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m: On a front end, load the MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, verify what the default parallel configuration is. 
Either the 'local' configuration or a PBS configuration will work and will yield the same result. Then run mypbssubmit.m. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, get all output arguments into a cell array and display the results. When you do not want to keep the files of a job, use MATLAB function destroy() to erase them from your current working directory: View job status from qstat: View the results of the tasks (in an abbreviated form): Several compute nodes participated in the processing. When a task does not get a license, it does not run. Suppose Task8 did not get a license. View the log file for Task8, Job1.Task8.log: The log file for Task8 reads, "License checkout failed." Perhaps other tasks (including other users' tasks) had already taken all available licenses. After the "FINISHED SUBMITTING" message, you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with PBS qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy(), so you may rerun MATLAB and find your job: The second method of submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node. This method is similar to the first method since it uses a MATLAB script M-file. It differs from the first method because the script is slightly different. This script can wait for all tasks to finish. Also, this method differs from the first because the MATLAB client runs on a compute node rather than on the front end. 
Either the 'local' configuration or a PBS configuration will work and will yield the same result. Prepare a MATLAB script M-file with an appropriate filename, here named mylclsubmit.m:
Prepare a job submission file with an appropriate filename, here named myjob.sub:
Submit the job with either the 'local' configuration or a PBS configuration as the default parallel configuration; the result will be the same. Request one license from the Parallel Computing Toolbox (PCT):
View job status:
View results (in an abbreviated form) in the file for all standard output, myjob.sub.omyjobid:
Output shows that one compute node (a002) processed all tasks of the distributed job. Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about distributed jobs:
The MATLAB Parallel Computing Toolbox (PCT) offers a parallel job via the MATLAB Distributed Computing Server (DCS). The tasks of a parallel job are identical, run simultaneously on several MATLAB workers (labs), and communicate with each other. PCT offers a distributed-memory computing environment with a maximum of eight MATLAB workers (labs, MPI ranks; version R2009a) and 12 workers (labs, MPI ranks; version R2011a) running on the local configuration. Moreover, the MATLAB Distributed Computing Server (DCS) scales up PCT applications to the limit of your DCS licenses. This section illustrates an MPI-like program. Areas of application include distributed arrays and message passing.
This section illustrates ten methods of submitting a small MATLAB parallel job with four workers running one MPI-like task to a PBS queue. The MATLAB program broadcasts an integer, which might be the number of slices of a numerical integration, to four workers and gathers the names of the compute nodes running the workers and the lab IDs of the workers. The system function hostname returns two values: a numerical code and the name of the compute node that runs the program.
The first method runs on a front end a MATLAB client which calls the MATLAB batch() function with a user-defined PBS configuration. Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session (including the two serial regions in the MATLAB M-code) while the other four run the spmd statement. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running function batch() and quitting MATLAB. The five DCS licenses remain active between running function batch() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license(). The second method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB batch() function with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end. The third method runs on a front end a MATLAB client which calls the MATLAB submit() function with the 'torque' scheduler. When you need more granularity in your submission, use submit(). 
Since it uses the 'torque' scheduler, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out seven licenses: one MATLAB license for the client running on the front end, one PCT license, and five DCS licenses. One DCS license runs the batch session while the other four run the parallel code. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. The five DCS licenses remain active between running function submit() and job completion, even when the serial portion of the program is running. The DCS licenses do not appear in the output of function license().
The fourth method uses the PBS qsub command to submit to a compute node a MATLAB client which calls the MATLAB submit() function with the 'local' configuration. When you need more granularity in your submission, use submit(). Since it uses the 'local' scheduler, this method, when executed, uses the MATLAB interpreter and the Parallel Computing Toolbox; so, it requires and checks out two licenses: one MATLAB license for the client running on the compute node and one PCT license. The MATLAB license remains active between starting and quitting MATLAB. The PCT license remains active between running a PCT function, such as findResource(), and quitting MATLAB. This job is completely off the front end.
The fifth method uses the PBS qsub command to submit to compute nodes a MATLAB client which interprets an M-file with a user-defined PBS configuration which scatters the MATLAB workers onto different compute nodes.
Since it uses a PBS configuration, this method, when executed, uses the MATLAB interpreter, the Parallel Computing Toolbox, and the Distributed Computing Server; so, it requires and checks out six licenses: one MATLAB license for the client running on the compute node, one PCT license, and four DCS licenses. Four DCS licenses run the four copies of the parallel job. This job is completely off the front end. The sixth method uses the PBS qsub command to submit to a compute node a MATLAB client which interprets an M-file with the 'local' configuration. Since it uses the 'local' configuration, this method, when executed, uses the MATLAB interpreter; so, it requires and checks out one license: one MATLAB license for the client running on the compute node. The MATLAB license remains active between starting and quitting MATLAB. This job is completely off the front end. The seventh method uses the MATLAB Compiler mcc and the default parallel configuration set to a PBS configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses a PBS configuration during compilation, this method, when executed, uses the MATLAB Distributed Computing Server; so, it requires and checks out four DCS licenses during the execution of the spmd statement. This job is completely off the front end. The eighth method uses the MATLAB Compiler mcc and the default parallel configuration set to the 'local' configuration to compile a MATLAB M-file and submits the compiled file to a PBS queue (support for running compiled PCT code on the 'local' configuration was added in R2011a; this feature removes the need for DCS licenses in some cases). It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. 
Since it uses the 'local' configuration, this method, when executed, uses no license. This job is completely off the front end. The ninth method uses the MATLAB Compiler mcc, qsub, and your PBS configuration to compile a parallel job. This method is the third method compiled. This method compiles a script M-file which contains the specifications of job submission plus the code of the parallel job in a function M-file. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses a PBS configuration during compilation, this method, when executed, uses the MATLAB Distributed Computing Server; so, it requires and checks out four DCS licenses during the execution of the spmd statement. This job is completely off the front end. The tenth method uses the MATLAB Compiler mcc, qsub, and the MATLAB 'local' configuration to compile a parallel job. This method is the fourth method compiled. This method compiles a script M-file which contains the specifications of job submission plus the code of the parallel job in a function M-file. It requires and checks out one MATLAB Compiler license. This license remains checked out for 30 minutes after the compilation completes. Since it uses a PBS configuration during compilation, this method, when executed, uses the MATLAB Distributed Computing Server; so, it requires and checks out four DCS licenses during the execution of the spmd statement. This job is completely off the front end. The MATLAB Compiler license is a lingering license. Using the compiler locks its license for at least 30 minutes. Each time you issue the compiler command, you reset the 30-minute timer. When running mcc at the MATLAB prompt, you hold the license as long as MATLAB remains open. To release the license, quit MATLAB. Quitting MATLAB is necessary to release the license, but not sufficient. 
If you quit MATLAB before the 30-minute timer runs out, the license remains checked out for the remaining time. When running mcc at the Linux prompt, the 30-minute timer starts or restarts.
The license manager tracks MATLAB licenses per user per cluster. If you run the MATLAB Compiler on two different clusters, you use two Compiler licenses. To minimize your license usage, run the Compiler on one cluster.
You can share your compiled program with colleagues who do not have MATLAB licenses. Your compiled program carries its required licenses within itself, so you do not need to install any license file on your colleague's machine.
The following table summarizes MATLAB license usage:
Prepare a MATLAB parallel program in the form of a MATLAB script M-file and a MATLAB function M-file with appropriate filenames, here named myscript.m and myfunction.m:
Both M-files display the names of all compute nodes which run the job and the associated lab IDs. The script M-file uses fprintf() to display the results. The function M-file returns a single value which contains a concatenation of the results. The execution of a parallel job has all copies of the single task running concurrently and perhaps also communicating with each other.
The first method of job submission uses function batch() to offload, from the MATLAB client running on the front end to a batch session running on a worker (compute node), the responsibilities of supervising another set of workers called the pool and accumulating the results. Since function batch() accepts only MATLAB pool jobs, you must convert the parallel job to a pool job by surrounding the code with an spmd statement (MATLAB function batch() accepts either a script M-file or a function M-file). Use appropriate filenames, here named myscript.m and myfunction.m:
On the front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client.
At the MATLAB prompt, view the current default parallel configuration; most likely, it is the 'local' configuration. Call MATLAB function batch() to make a four-lab pool on which to run the MATLAB code in the file myscript.m. In the call, replace the 'local' configuration by specifying your PBS configuration which you previously designed with the Configuration Manager (using the 'local' configuration at this point will run the spmd statement on the front end). This particular PBS configuration scatters the labs to different compute nodes to verify that a four-lab spmd statement actually uses five processor cores. Also, capture job output in the diary. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, view results by viewing the diary. When you do not want to keep the files of your job, use MATLAB function destroy() to erase them from your current working directory. After this job finishes, the default parallel configuration remains the 'local' configuration. The license name distrib_computing_toolbox refers to the Parallel Computing Toolbox.
View job status from qstat:
Job status shows that the MATLAB client submitted this job as five compute nodes (NDS), each with one processor core (TSK), and with a wall time of one minute. The call to function batch() specifies four labs to evaluate the four copies of the parallel region (spmd statement). The fifth lab runs the batch session, myscript.m, and accumulates the results. This arrangement explains the need for five DCS licenses.
View job output from the diary:
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs in the pool which processed the parallel region. The fifth worker, which runs the batch session, does not appear in the output since there is no serial region.
Because the MATLAB pool requires the lab running the batch session in addition to N labs in the pool, there must be at least N+1 processor cores available on the cluster. The MATLAB client scattered the five processor cores (five MATLAB labs) among five different compute nodes. A processor core of one compute node processed the batch session while four processor cores of four different compute nodes (a001, a002, a003, a004) processed the parallel regions. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Each lab received the broadcast value: 1,000. Function gcat() collected in Lab 1 and from each parallel region the name of the compute node.
After calling the MATLAB function batch(), you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with the PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy().
To apply the first method of job submission to a function M-file, use one of the following sequences:
Function batch() offers an alternate mode of operation. If you exclude the property Configuration and its value from the list of arguments of function batch(), then batch() reads the default parallel configuration. Before calling batch(), set the default parallel configuration to your PBS configuration which you previously designed with the Configuration Manager. This change of the default parallel configuration is immediate and permanent; the PBS configuration remains the default parallel configuration during subsequent runs of MATLAB. MATLAB function batch() reads the default parallel configuration to discover your instructions about submission.
To scale up this method to handle a real application, use the Configuration Manager in the Parallel menu to increase the wall time to accommodate a longer running job. Enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of DCS licenses purchased.
The second method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node. This method is similar to the first method since it uses function batch() and the same MATLAB M-file, either myscript.m or myfunction.m, modified from a parallel job to a pool job with an spmd statement. This method differs from the first because the MATLAB client runs on a compute node rather than on the front end and the client uses the MATLAB 'local' configuration rather than a user-defined PBS configuration.
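The sizing rule for the first method can be sketched in shell. This is a minimal sketch under stated assumptions, not part of any ITaP tooling: the function name batch_pool_request is hypothetical, and it encodes only what the text above states, namely that a pool of N labs needs one extra worker for the batch session (N+1 processor cores and N+1 DCS licenses) plus one PCT license. The license names are the ones that appear in the SubmitArguments examples in this guide.

```shell
#!/bin/sh
# Hypothetical helper (not an ITaP or MathWorks tool): print the resources
# a batch() pool job needs under a PBS configuration, given the pool size
# (the value of the property Matlabpool).
batch_pool_request() {
    pool=$1
    # One extra worker runs the batch session itself.
    workers=$((pool + 1))
    echo "cores=$workers"
    echo "gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+$workers"
}

# Example: the four-lab pool used throughout this section.
batch_pool_request 4
```

For the four-lab pool used in this section, the sketch reports five cores and five DCS licenses, which matches the job status described for the first method.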
Prepare a MATLAB script M-file that calls MATLAB function batch() which makes a four-lab pool on which to run the MATLAB code in the file myscript.m, which specifies the 'local' configuration, and which captures job output in the diary. Use an appropriate filename, here named mylclbatch.m:
Prepare a job submission file with an appropriate filename, here named myjob.sub:
Submit the job as a single compute node with six processor cores and request one PCT license:
One processor core runs myjob.sub and mylclbatch.m; one processor core runs the batch session; four processor cores run the four copies of the parallel region.
View job status:
Job status shows six processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Output shows that the nonnegative scalar integer which is the value of the property Matlabpool (4) defined the number of labs (4) in the pool which processed the parallel region. While output does not explicitly show the fifth worker, that worker runs the batch session myscript.m. Because the MATLAB pool requires the lab running the batch session in addition to the N labs in the pool, there must be at least N+1 processor cores available on the compute node. Output shows that processor cores on one compute node (a017) processed the entire job. One processor core processed myjob.sub and mylclbatch.m. One processor core processed the batch session while four processor cores processed the parallel regions. Variable numlabs shows the total number of labs (4). Variable labindex shows the ID of an individual lab. There are four labs, so there are four lab IDs.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply the second method of job submission to a function M-file, modify mylclbatch.m with one of the following sequences:
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the property Matlabpool which appears as an argument in the call of function batch(). The maximum possible size of the pool is one less than the number of workers allowed in the 'local' configuration, which is 8 (R2009a) or 12 (R2011a). Finally, the value of ppn must be two greater than the value of Matlabpool. Specifying a MATLAB pool with 12 labs (R2011a) means a total of 13 workers, which exceeds the limit of the 'local' configuration. The relevant lines of code and the error follow:
The third method of job submission uses function submit() to submit a parallel job to a PBS queue. Function submit() can accept a parallel job in the form of a MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file).
Prepare a MATLAB script M-file which finds the MATLAB 'torque' scheduler; which specifies scattering four processor cores to four different compute nodes, one minute of wall time for this short job, and the number of PCT and DCS licenses; which defines a MATLAB parallel job with four workers (labs) and one task; and which calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m:
On a front end, load a MATLAB module and verify the version of the MATLAB module loaded. Run a MATLAB client. At the MATLAB prompt, run mypbssubmit. To monitor when your job finishes, run the PBS command qstat in a shell (exclamation point "!") or MATLAB function get(). After your job finishes, get all output arguments into a cell array and display the results. When you do not want to keep the files of a job, use MATLAB function destroy() to erase them from your current working directory.
View job status from qstat:
Job status shows that the MATLAB client submitted this job as four compute nodes (NDS), each with one processor core (TSK), and the requested wall time of one minute. The call to function submit() specifies four labs as the minimum and maximum number of labs of the MATLAB pool. Four labs evaluate the four copies of the parallel job.
View job output:
pjob=batch('myfunction','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);
$ qsub -l nodes=1:ppn=14,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using batch (line 172)
You requested a minimum of 13 workers but only 12 workers are allowed with the
local scheduler.
Error in mylclbatch (line 6)
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);}
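The error above can be caught before submission. The following is a minimal sketch, assuming only the two rules stated for the second method: the pool plus the batch session must fit within the worker limit of the 'local' configuration, and ppn must be two greater than Matlabpool. The function name local_pool_qsub is hypothetical, and the worker limit is passed in as an argument (8 for R2009a, 12 for R2011a according to this guide).

```shell
#!/bin/sh
# Hypothetical helper: check a pool size against the 'local' worker limit
# and print the matching qsub command for the second method.
local_pool_qsub() {
    pool=$1
    local_max=$2
    # The batch session counts as a worker, so pool + 1 must fit.
    if [ $((pool + 1)) -gt "$local_max" ]; then
        echo "pool of $pool needs $((pool + 1)) workers; 'local' allows $local_max" >&2
        return 1
    fi
    # One more core runs myjob.sub and mylclbatch.m: ppn = pool + 2.
    echo "qsub -l nodes=1:ppn=$((pool + 2)),walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub"
}

# Example: a four-lab pool under R2011a (limit of 12 workers).
local_pool_qsub 4 12
```

The sketch only prints the command; it does not submit anything. A pool of 12 with a limit of 12 is rejected, which corresponds to the error shown above.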
% FILENAME: mypbssubmit.m
sched = findResource('scheduler', 'type', 'torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+5');
pjob = createMatlabPoolJob(sched);
set(pjob,'MinimumNumberOfWorkers',5);
set(pjob,'MaximumNumberOfWorkers',5);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> mypbssubmit
FINISHED SUBMITTING
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> disp(pjob.get('State'))
finished
>> result = getAllOutputArguments(pjob)
result =
[9x68 char]
>> result{1}
>> ls -l
>> pjob.destroy;
>> ls -l
>> quit
$
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115265.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --
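When several jobs are queued, the NDS, TSK, and S columns are the ones this section keeps referring to. The following is a minimal sketch that pulls them out of qstat output; the function name qstat_summary is hypothetical, and it assumes the column layout shown in the listings in this guide, with job names that contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -u <user>` output on stdin and print
# job ID, nodes (NDS), cores (TSK), and state (S) for each job row.
qstat_summary() {
    # Job rows start with a numeric job ID followed by a dot;
    # header and separator lines do not.
    awk '$1 ~ /^[0-9]+\./ { print $1, "NDS=" $6, "TSK=" $7, "S=" $10 }'
}

# On the cluster (not run here): qstat -u myusername | qstat_summary
# Demonstration with a row copied from the listing above:
printf '%s\n' \
    'Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time' \
    '--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----' \
    '115265.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --' | qstat_summary
```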
SERIAL REGION: hostname:radon-a636.rcac.purdue.edu
hostname numlabs labindex
------------------------------- ------- --------
PARALLEL REGION: radon-a634.rcac.purdue.edu 4 1
PARALLEL REGION: radon-a633.rcac.purdue.edu 4 2
PARALLEL REGION: radon-a632.rcac.purdue.edu 4 3
PARALLEL REGION: radon-a631.rcac.purdue.edu 4 4
SERIAL REGION: hostname:radon-a636.rcac.purdue.edu
Elapsed time in parallel region: 2.878010
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','type','torque');
>> pjob=findJob(sched,'State','finished');
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob.destroy;
>> quit
$
set(sched,'SubmitArguments','-l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+5');
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115273.radon-a myusername standby Job1 -- 1 5 -- 00:01 Q --
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
hostname numlabs labindex
------------------------------- ------- --------
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 2
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 3
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 4
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 2.964572
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','Configuration','mypbsconfig');
pjob = createMatlabPoolJob(sched);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
% FILENAME: mylclsubmit.m
!echo "mylclsubmit.m"
!hostname
sched = findResource('scheduler', 'type', 'local');
set(sched,'ClusterMatlabRoot',matlabroot);
pjob = createMatlabPoolJob(sched);
set(pjob,'MinimumNumberOfWorkers',5)
set(pjob,'MaximumNumberOfWorkers',5)
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
pjob.wait;
result = getAllOutputArguments(pjob);
result{1}
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclsubmit
$ qsub -l nodes=1:ppn=6,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
99025.radon-ad myusername standby myjob.sub 30197 1 6 -- 00:01 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclsubmit.m
radon-a639.rcac.purdue.edu
>> FINISHED SUBMITTING
ans =
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
hostname numlabs labindex
------------------------------- ------- --------
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 2
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 3
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 4
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 3.587376
set(pjob,'MinimumNumberOfWorkers',13);
set(pjob,'MaximumNumberOfWorkers',13);
$ qsub -l nodes=1:ppn=14,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using distcomp.simpleparalleljob/pSetMinimumNumberOfWorkers (line 59)
MinimumNumberOfWorkers must be the same as or less than MaximumNumberOfWorkers
for a job
Error in mylclsubmit (line 9)
set(pjob,'MinimumNumberOfWorkers',13);
% FILENAME: myscript.m
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
matlabpool open 4;
fprintf(' hostname numlabs labindex\n')
fprintf(' ------------------------------- ------- --------\n')
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL REGION: %-31s %7d %8d\n', name,numlabs,labindex)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
matlabpool close force;
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel region: %f\n', elapsed_time)
quit;
% FILENAME: myfunction.m
function result = myfunction ()
% SERIAL REGION
% Variable "r" is a "composite object."
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
matlabpool open 4;
r = sprintf(' hostname numlabs labindex');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- --------');
result = strvcat(result,r);
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL REGION: %-31s %7d %8d', name,numlabs,labindex);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
for ndx=1:length(r) % concatenate composite object "r"
result = strvcat(result,r{ndx});
end
matlabpool close force;
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('Elapsed time in parallel region: %f', elapsed_time);
result = strvcat(result,r);
end
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r myscript
# matlab -nodisplay -r myfunction
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit;
$
$ qsub -l nodes=1:ppn=1,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+4 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
332026.radon-ad myusername standby myjob.sub 31850 1 1 -- 00:01 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
332026.radon-ad myusername standby myjob.sub 31850 1 1 -- 00:01 R 00:00
332028.radon-ad myusername standby Job1 668 4 4 -- 00:01 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
hostname numlabs labindex
------------------------------- ------- --------
Lab 2:
PARALLEL REGION: radon-a638.rcac.purdue.edu 4 2
Lab 1:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
Lab 3:
PARALLEL REGION: radon-a637.rcac.purdue.edu 4 3
Lab 4:
PARALLEL REGION: radon-a636.rcac.purdue.edu 4 4
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 3.382151
$ matlab -nodisplay
>> defaultParallelConfig('local');
>> quit;
$
$ qsub -l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115287.radon-ad myusername standby myjob.sub 24010 1 5 -- 00:01 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
hostname numlabs labindex
------------------------------- ------- --------
Lab 1:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
Lab 2:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 2
Lab 3:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 3
Lab 4:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 4
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 3.425426
matlabpool open 13;
$ qsub -l nodes=1:ppn=13,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myscript (line 6)
matlabpool open 13;
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
% FILENAME: myscript.m
% SERIAL REGION
[c name] = system('hostname');
fprintf('SERIAL REGION: hostname:%s\n', name)
matlabpool open 4;
fprintf(' hostname numlabs labindex\n')
fprintf(' ------------------------------- ------- --------\n')
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('PARALLEL REGION: %-31s %7d %8d\n', name,numlabs,labindex)
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
matlabpool close force;
fprintf('\n')
[c name] = system('hostname');
name = name(1:length(name)-1);
fprintf('SERIAL REGION: hostname:%s\n', name)
fprintf('Elapsed time in parallel region: %f\n', elapsed_time)
quit;
% FILENAME: myfunction.m
function result = myfunction ()
% SERIAL REGION
% Variable "r" is a "composite object."
[c name] = system('hostname');
result = sprintf('SERIAL REGION: hostname:%s', name);
matlabpool open 4;
r = sprintf(' hostname numlabs labindex');
result = strvcat(result,r);
r = sprintf(' ------------------------------- ------- --------');
result = strvcat(result,r);
tic;
% PARALLEL REGION
spmd
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('PARALLEL REGION: %-31s %7d %8d', name,numlabs,labindex);
pause(2);
end
% SERIAL REGION
elapsed_time = toc; % get elapsed time in parallel region
for ndx=1:length(r) % concatenate composite object "r"
result = strvcat(result,r{ndx});
end
matlabpool close force;
[c name] = system('hostname');
name = name(1:length(name)-1);
r = sprintf('\nSERIAL REGION: hostname:%s', name);
result = strvcat(result,r);
r = sprintf('Elapsed time in parallel region: %f', elapsed_time);
result = strvcat(result,r);
end
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_myscript.sh /apps/rhel5/MATLAB/R2011b
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit
$ mcc -m myscript.m
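Because each mcc run at the Linux prompt starts or restarts the 30-minute lingering timer on the Compiler license, it can help to estimate when the license will be released. The following is a minimal sketch of that arithmetic only; the function name mcc_license_remaining is hypothetical, times are plain epoch seconds supplied by the caller, and the sketch does not query the license manager. It does not apply to mcc run at the MATLAB prompt, where the license is held as long as MATLAB remains open.

```shell
#!/bin/sh
# Hypothetical helper: seconds until the lingering Compiler license would
# be released, given the epoch time of the last mcc run and the current time.
mcc_license_remaining() {
    last_mcc=$1
    now=$2
    linger=1800   # 30 minutes, as stated in this guide
    remaining=$((last_mcc + linger - now))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Example: mcc finished at t=1000 s and it is now t=1600 s.
mcc_license_remaining 1000 1600
```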
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_myscript.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 5957 00:05 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 5957 00:05 R 00:00
115293.radon-ad myusername standby Job1 29390 4 4 7005 00:01 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a639.rcac.purdue.edu
run_myscript.sh
radon-a639.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
hostname numlabs labindex
------------------------------- ------- --------
Lab 2:
PARALLEL REGION: radon-a638.rcac.purdue.edu 4 2
Lab 3:
PARALLEL REGION: radon-a636.rcac.purdue.edu 4 3
Lab 4:
PARALLEL REGION: radon-a633.rcac.purdue.edu 4 4
Lab 1:
PARALLEL REGION: radon-a639.rcac.purdue.edu 4 1
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a639.rcac.purdue.edu
Elapsed time in parallel region: 2.930676
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
$ mcc -m mywrapper.m myfunction.m
$ mkdir test
$ cp mywrapper test
$ cp run_mywrapper.sh test
$ cp myjob.sub test
$ cd test
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('local');
>> quit
$ mcc -m myscript.m
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_myscript.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \
$ qsub -l nodes=1:ppn=4,walltime=00:05:00 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 4 -- 00:05 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a299.rcac.purdue.edu
run_myscript.sh
radon-a299.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
SERIAL REGION: hostname:radon-a299.rcac.purdue.edu
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
hostname numlabs labindex
------------------------------- ------- --------
Lab 1:
PARALLEL REGION: radon-a299.rcac.purdue.edu 4 1
Lab 2:
PARALLEL REGION: radon-a299.rcac.purdue.edu 4 2
Lab 3:
PARALLEL REGION: radon-a299.rcac.purdue.edu 4 3
Lab 4:
PARALLEL REGION: radon-a299.rcac.purdue.edu 4 4
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
SERIAL REGION: hostname:radon-a299.rcac.purdue.edu
Elapsed time in parallel region: 2.583002
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
$ mcc -m mywrapper.m myfunction.m
$ qsub -l nodes=1:ppn=4,walltime=00:01:00 myjob.sub
matlabpool open 13;
$ qsub -l nodes=1:ppn=13,walltime=00:01:00 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myscript (line 6)
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
}
distcomp:matlabpool:RunValidation
MATLAB Distributed Computing Server (distributed job)
| Method | MATLAB | PCT | DCS | mcc |
|---|---|---|---|---|
| submit() with 'torque' scheduler | 1 | 1 | number of tasks | 0 |
| submit() with 'local' scheduler, qsub | 1 | 1 | 0 | 0 |
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','type','torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+11');
pjob = createJob(sched);
% Make several new tasks in a job.
% Here, the number of tasks is three more than the number of processor cores per compute node.
for i = 1:11
    task = createTask(pjob,@system,2,{'hostname'});
end
% To run your functions instead of a system function, set the
% necessary file dependencies. This tells a MATLAB worker
% (compute node) where to find the files of your functions.
% set(pjob,'FileDependencies',{'myf_1.m','myf_2.m','myf_3.m'});
% createTask(pjob,@myf_1,1,{});
% createTask(pjob,@myf_2,1,{});
% createTask(pjob,@myf_3,1,{});
submit(pjob);
disp('FINISHED SUBMITTING')
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay -singleCompThread
>> defaultParallelConfig('local')
>> mypbssubmit
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> disp(pjob.get('State'))
finished
>> result = getAllOutputArguments(pjob)
result =
[0] [1x28 char]
[0] [1x28 char]
[0] [1x28 char]
...
[0] [1x28 char]
>> result{1:22}
>> pjob.destroy;
>> quit;
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
160901.radon-ad kes standby Job4Task1 -- 1 1 -- 00:01 Q --
160902.radon-ad kes standby Job4Task2 -- 1 1 -- 00:01 Q --
160903.radon-ad kes standby Job4Task3 -- 1 1 -- 00:01 Q --
...
160911.radon-ad kes standby Job4Task11 -- 1 1 -- 00:01 Q --
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
ans = radon-a004.rcac.purdue.edu
ans = radon-a004.rcac.purdue.edu
ans = radon-a004.rcac.purdue.edu
ans = radon-a004.rcac.purdue.edu
ans = radon-a004.rcac.purdue.edu
ans = radon-a005.rcac.purdue.edu
ans = radon-a005.rcac.purdue.edu
ans = radon-a005.rcac.purdue.edu
ans = radon-a005.rcac.purdue.edu
ans = radon-a005.rcac.purdue.edu
ans = radon-a006.rcac.purdue.edu
...
ans = radon-a006.rcac.purdue.edu
/var/spool/PBS/mom_priv/jobs/1808906[8].radon-adm.rcac.purdue.edu.SC: line 29: cd: /tmp/pbs.1808906[8].radon-adm.rcac.purdue.edu: No such
file or directory
Executing: /apps/rhel5/MATLAB/R2011b/bin/worker
License checkout failed.
License Manager Error -4
Maximum number of users for MATLAB_Distrib_Comp_Engine reached.
Try again later.
To see a list of current users use the lmstat utility or contact your License Administrator.
Troubleshoot this issue by visiting:
http://www.mathworks.com/support/lme/R2011b/4
Diagnostic Information:
Feature: MATLAB_Distrib_Comp_Engine
License path: /home/myusername/.matlab/R2011b_licenses:/apps/rhel5/MATLAB/R2011b/licenses/license.dat:/apps/rhe
l5/MATLAB_R2011b/licenses/ecn.lic:/apps/rhel5/MATLAB/R2011b/licenses/mdce.lic
FLEXnet Licensing error: -4,132.
MATLAB exited with code: 1
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','type','torque');
>> job=findJob(sched,'State','finished');
>> result=getAllOutputArguments(job);
result =
[0] [1x28 char]
[0] [1x28 char]
[0] [1x28 char]
...
[0] [1x28 char]
>> result{1:22}
>> job.destroy;
>> quit;
$
% FILENAME: mylclsubmit.m
sched = findResource('scheduler','type','local');
set(sched,'ClusterMatlabRoot',matlabroot);
job = createJob(sched);
task = createTask(job,@system,2,{{'hostname'},{'hostname'},{'hostname'},{'hostname'},{'hostname'},{'hostname'},{'hostname'},{'hostname'},{'hostname'}});
submit(job);
disp('FINISHED SUBMITTING')
waitForState(job);
results = getAllOutputArguments(job);
results{1:18}
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclsubmit
$ qsub -l nodes=1,gres=Parallel_Computing_Toolbox+1 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
161181.radon-ad myusername standby Job1Task1 -- 1 1 -- 00:01 Q --
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a002.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
FINISHED SUBMITTING
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
ans = radon-a002.rcac.purdue.edu
...
ans = radon-a002.rcac.purdue.edu
MATLAB Distributed Computing Server (parallel job)
| Method | Description | MATLAB | PCT | DCS | mcc | Limitations |
|---|---|---|---|---|---|---|
| 1 | batch() with user-defined PBS configuration | 1 | 1 | Matlabpool + 1 | 0 | number of MATLAB,PCT,DCS licenses purchased |
| 2 | batch() with 'local' configuration, qsub | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 3 | submit() with 'torque' scheduler | 1 | 1 | MaximumNumberOfWorkers | 0 | number of MATLAB,PCT,DCS licenses purchased |
| 4 | submit() with 'local' scheduler, qsub | 1 | 1 | 0 | 0 | local scheduler with 8 (R2009a) and 12 (R2011a) workers |
| 5 | qsub with user-defined PBS configuration | 1 | 1 | pool size | 0 | number of MATLAB,PCT,DCS licenses purchased |
| 6 | qsub with 'local' configuration | 1 | 1 | 0 | 0 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 7 | Compiler with user-defined PBS configuration, qsub | 0 | 0 | pool size | 1 | number of DCS licenses purchased |
| 8 | Compiler with 'local' configuration, qsub | 0 | 0 | 0 | 1 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
| 9 | Compiler with user-defined PBS configuration, qsub | 0 | 0 | MaximumNumberOfWorkers | 1 | number of DCS licenses purchased |
| 10 | Compiler with 'local' configuration, qsub | 0 | 0 | 0 | 1 | local configuration with 8 (R2009a) and 12 (R2011a) workers |
% FILENAME: myscript.m
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
if (labindex == 1)
disp(result)
end
% FILENAME: myfunction.m
function result = myfunction ()
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenation to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
end
% FILENAME: myscript.m
% Convert this parallel job to a pool job.
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
if labindex == 1
disp(result)
end
end % spmd
% FILENAME: myfunction.m
function result = myfunction ()
result = 0;
% Convert this parallel job to a pool job.
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
rslt = gcat(str,1,1);
if labindex == 1
disp(rslt)
end
end % spmd
result = rslt{1};
end % function
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> disp(defaultParallelConfig);
local
>> pjob=batch('myscript','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> license('inuse')
distrib_computing_toolbox
matlab
>> disp(pjob.get('State'))
finished
>> license('inuse')
distrib_computing_toolbox
matlab
>> pjob.diary;
>> ls -l
>> pjob.destroy;
>> ls -l
>> disp(defaultParallelConfig);
local
>> quit;
$
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
156491.radon-a myusername standby Job1 -- 5 5 -- 00:01 Q --
Lab 1:
radon-a001.rcac.purdue.edu:4:1:1000
radon-a002.rcac.purdue.edu:4:2:1000
radon-a003.rcac.purdue.edu:4:3:1000
radon-a004.rcac.purdue.edu:4:4:1000
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','Configuration','mypbsconfig');
>> pjob=findJob(sched,'State','finished');
>> pjob.diary;
>> pjob.destroy;
>> quit;
$
>> pjob=batch('myfunction','Matlabpool',4,'Configuration','mypbsconfig','CaptureDiary',true);
>> disp(pjob.get('State'))
finished
>> pjob.diary
>> pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','mypbsconfig');
>> disp(pjob.get('State'))
finished
>> result=getAllOutputArguments(pjob);
>> result{1}
% FILENAME: mylclbatch.m
!echo "mylclbatch.m"
!hostname
pjob=batch('myscript','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
quit;
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclbatch
$ qsub -l nodes=1:ppn=6,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
99025.radon-ad myusername standby myjob.sub 30197 1 6 -- 00:01 R 00:00
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a017.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclbatch.m
radon-a017.rcac.purdue.edu
Lab 1:
radon-a017.rcac.purdue.edu:4:1:1000
radon-a017.rcac.purdue.edu:4:2:1000
radon-a017.rcac.purdue.edu:4:3:1000
radon-a017.rcac.purdue.edu:4:4:1000
pjob=batch('myfunction','Matlabpool',4,'Configuration','local','CaptureDiary',true);
pjob.wait;
pjob.diary
pjob.load;
disp(ans)
result = getAllOutputArguments(pjob);
disp(result{1}.ans)
pjob=batch('myfunction',1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
disp(result{1})
pjob=batch(@myfunction,1,{},'Matlabpool',4,'Configuration','local');
pjob.wait;
result = getAllOutputArguments(pjob);
disp(result{1})
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);
$ qsub -l nodes=1:ppn=14,walltime=00:05:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using batch (line 172)
You requested a minimum of 13 workers but only 12 workers are allowed with the
local scheduler.
Error in mylclbatch (line 6)
pjob=batch('myscript','Matlabpool',12,'Configuration','local','CaptureDiary',true);}
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','type','torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+4');
pjob=createParallelJob(sched);
set(pjob,'MinimumNumberOfWorkers',4);
set(pjob,'MaximumNumberOfWorkers',4);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
$ module load matlab/R2011b
$ module list
1) matlab/R2011b
$ matlab -nodisplay
>> mypbssubmit
FINISHED SUBMITTING
>> !qstat -u myusername
>> disp(pjob.get('State'))
queued
>> disp(pjob.get('State'))
running
>> disp(pjob.get('State'))
finished
>> result = getAllOutputArguments(pjob)
result =
[4x39 char]
[]
[]
[]
>> result{1}
>> ls -l
>> pjob.destroy
>> ls -l
>> quit
$
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
157168.radon-a myusername standby Job1 -- 4 4 -- 00:01 Q --
radon-a000.rcac.purdue.edu:4:1:1000 radon-a001.rcac.purdue.edu:4:2:1000 radon-a002.rcac.purdue.edu:4:3:1000 radon-a003.rcac.purdue.edu:4:4:1000
Output shows that the value (4) of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers defined the number of labs in the pool. Because the MATLAB pool requires N labs, at least N processor cores must be available on the cluster.
The MATLAB client scattered the four processor cores (four MATLAB labs) among four different compute nodes. Four processor cores on four different compute nodes (a000,a001,a002,a003) processed the four labs of the parallel job. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Finally, each lab received the broadcast value: 1,000.
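Each record in the job output has the form host:numlabs:labindex:value. As a rough sketch of how to check such output from the Linux prompt, the following shell fragment splits the sample line shown above into its fields and counts the distinct compute nodes. The helper names summarize_labs and count_nodes are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sample text copied from the job output shown above.
output='radon-a000.rcac.purdue.edu:4:1:1000 radon-a001.rcac.purdue.edu:4:2:1000 radon-a002.rcac.purdue.edu:4:3:1000 radon-a003.rcac.purdue.edu:4:4:1000'

# summarize_labs (hypothetical helper): print one readable line per lab record.
summarize_labs() {
    # $1 is deliberately unquoted so the shell splits it into one record per word;
    # awk then splits each record on ':' into host, numlabs, labindex, value.
    printf '%s\n' $1 |
        awk -F: '{ printf "lab %s of %s on %s received %s\n", $3, $2, $1, $4 }'
}

# count_nodes (hypothetical helper): count the distinct host names in the output.
count_nodes() {
    printf '%s\n' $1 | cut -d: -f1 | sort -u | wc -l | tr -d ' '
}

summarize_labs "$output"
echo "distinct nodes: $(count_nodes "$output")"
```

For the single-node variant of this example, the same count would be 1 instead of 4.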
After the "FINISHED SUBMITTING" message, you may terminate your MATLAB client and perform other chores while your compute-intensive job runs. You may query your job with PBS command qstat entered at the Linux prompt, just like for any other PBS job. MATLAB will make job-related files in either the current working directory or a directory specified with DataLocation. These files exist until you call the MATLAB function destroy().
$ module load matlab/R2011b
$ matlab -nodisplay
>> sched=findResource('scheduler','type','torque');
>> pjob=findJob(sched,'State','finished');
>> result=getAllOutputArguments(pjob);
>> result{1}
>> pjob.destroy
>> quit
$
For practice, modify mypbssubmit.m to rerun this example on a single compute node with four processor cores:
set(sched,'SubmitArguments','-l nodes=1:ppn=4,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+4');
View job status from qstat:
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115202.radon-a myusername standby Job1 -- 1 4 -- 00:01 R --
View job output:
radon-a000.rcac.purdue.edu:4:1:1000 radon-a000.rcac.purdue.edu:4:2:1000 radon-a000.rcac.purdue.edu:4:3:1000 radon-a000.rcac.purdue.edu:4:4:1000
Output shows that processor cores of one compute node (a000) processed the entire job.
Function submit() offers an alternate mode of operation. It can read your PBS configuration made previously with the Configuration Manager. Here is an alternate version of mypbssubmit.m which relies on mypbsconfig for the several details of job submission:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','Configuration','mypbsconfig');
pjob=createParallelJob(sched);
set(pjob,'MinimumNumberOfWorkers',4);
set(pjob,'MaximumNumberOfWorkers',4);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
To scale up this method to handle a real application, increase the wall time to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of DCS licenses purchased. Finally, increase the value of MATLAB_Distrib_Comp_Server to match the value of MinimumNumberOfWorkers.
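When scaling up, the DCS license count in SubmitArguments has to track the number of workers. A minimal sketch, assuming you keep the worker count and wall time in one place, is a small shell helper that prints a matching SubmitArguments string to paste into mypbssubmit.m. The function name submit_arguments is hypothetical; the resource names are the ones used in this guide.

```shell
#!/bin/sh
# submit_arguments WORKERS WALLTIME (hypothetical helper)
# Prints a SubmitArguments value that requests one PCT license and
# one DCS license per worker, following the pattern used in mypbssubmit.m.
submit_arguments() {
    workers=$1
    walltime=$2
    printf '%s\n' "-l walltime=${walltime},gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+${workers}"
}

# Reproduce the value used in the four-worker example.
submit_arguments 4 00:01:00
```

Called with 4 and 00:01:00, the helper reproduces the string used in the example; a longer run might use, say, 8 workers and a wall time of 04:00:00.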
The fourth method of job submission uses the PBS qsub command to submit a parallel job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the third method since it uses function submit() and the same MATLAB function M-file myfunction.m (MATLAB function submit() accepts only a function M-file). This method differs from the third because the MATLAB client runs on a compute node rather than on the front end and the client uses the 'local' scheduler rather than the MATLAB 'torque' scheduler.
Prepare a MATLAB script M-file which finds the MATLAB 'local' scheduler, which defines a MATLAB parallel job with four workers (labs) and one task, and which calls MATLAB function submit(). Use an appropriate filename, here named mylclsubmit.m:
% FILENAME: mylclsubmit.m
!echo "mylclsubmit.m"
!hostname
sched = findResource('scheduler','type','local');
set(sched,'ClusterMatlabRoot',matlabroot);
pjob=createParallelJob(sched);
set(pjob,'MinimumNumberOfWorkers',4);
set(pjob,'MaximumNumberOfWorkers',4);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING');
pjob.wait;
result = getAllOutputArguments(pjob);
disp(result{1});
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r mylclsubmit
Submit the job to a single compute node with five processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
One processor core runs myjob.sub and mylclsubmit.m, and four processor cores run the four copies of the parallel job.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
157171.radon-a myusername standby myjob.sub -- 1 5 -- 00:01 Q --
Job status shows five processor cores (TSK) on a single compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a000.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
mylclsubmit.m
radon-a000.rcac.purdue.edu
FINISHED SUBMITTING
radon-a000.rcac.purdue.edu:4:1:1000
radon-a000.rcac.purdue.edu:4:2:1000
radon-a000.rcac.purdue.edu:4:3:1000
radon-a000.rcac.purdue.edu:4:4:1000
Output shows that the value (4) of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers defined the number of labs in the pool that processed the parallel job. Because the MATLAB pool requires N workers, at least N processor cores must be available on the compute node.
Processor cores of compute node (a000) processed the entire job. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Finally, each lab received the broadcast value: 1,000.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value of MaximumNumberOfWorkers.
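The two constraints above (pool size within the 'local' limit, and ppn one greater than the number of workers) can be checked before submitting. The following is only a sketch under stated assumptions: the helper name local_qsub_command is hypothetical, and the limit of 12 is taken from the R2011b error messages quoted in this guide. The function prints the qsub command rather than running it.

```shell
#!/bin/sh
# Worker limit of the 'local' configuration, as reported by the R2011b
# error messages quoted in this guide (an assumption for other releases).
LOCAL_MAX_WORKERS=12

# local_qsub_command WORKERS WALLTIME (hypothetical helper)
# Prints the qsub command for a 'local' scheduler job, or fails if the
# requested pool exceeds the limit.
local_qsub_command() {
    workers=$1
    walltime=$2
    if [ "$workers" -gt "$LOCAL_MAX_WORKERS" ]; then
        echo "error: $workers workers exceeds the local limit of $LOCAL_MAX_WORKERS" >&2
        return 1
    fi
    # One extra core runs the MATLAB client (myjob.sub and mylclsubmit.m).
    ppn=$((workers + 1))
    echo "qsub -l nodes=1:ppn=${ppn},walltime=${walltime},gres=Parallel_Computing_Toolbox+1 myjob.sub"
}

local_qsub_command 4 00:01:00
```

With four workers this prints the same command used in the example above; a request for 13 workers is rejected before anything reaches the queue.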
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
set(pjob,'MinimumNumberOfWorkers',13);
set(pjob,'MaximumNumberOfWorkers',13);
$ qsub -l nodes=1:ppn=14,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using distcomp.simpleparalleljob/pSetMinimumNumberOfWorkers (line 59)
MinimumNumberOfWorkers must be the same as or less than MaximumNumberOfWorkers
for a job
Error in mylclsubmit (line 9)
set(pjob,'MinimumNumberOfWorkers',13);
The fifth method of submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is similar to the first and third methods since it uses either a MATLAB script M-file or a MATLAB function M-file. Like the first and third methods, this method uses a user-defined PBS configuration. It differs from the other two methods because the script is slightly different. Since this method uses neither batch() nor submit() to make the MATLAB pool, two MATLAB statements specify a MATLAB pool and an spmd statement converts this parallel job to a pool job. Also, the MATLAB script M-file must quit at the end of the batch process. Another significant difference: This method uses one fewer DCS license than the first method.
Modify the MATLAB M-files myscript.m and myfunction.m with matlabpool and spmd statements. Also, modify myscript.m with the quit statement:
% FILENAME: myscript.m
% Specify pool size.
% Convert the parallel job to a pool job.
matlabpool open 4;
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
if labindex == 1
disp(result)
end
end % spmd
matlabpool close force;
quit;
% FILENAME: myfunction.m
function result = myfunction ()
result = 0;
% Specify pool size.
% Convert the parallel job to a pool job.
matlabpool open 4;
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
rslt = gcat(str,1,1);
end % spmd
result = rslt{1};
matlabpool close force;
end % function
Prepare a job submission file with an appropriate filename, here named myjob.sub. Run with the name of either the script M-file or the function M-file:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
module load matlab/R2011b
cd $PBS_O_WORKDIR
unset DISPLAY
# -nodisplay: run MATLAB in text mode; X11 server not needed
# -r: read MATLAB program; use MATLAB JIT Accelerator
matlab -nodisplay -r myscript
# matlab -nodisplay -r myfunction
Run MATLAB to set the default parallel configuration to your PBS configuration:
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit;
$
Submit the job to a single compute node with one processor core and request one PCT license and four DCS licenses:
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=Parallel_Computing_Toolbox+1%MATLAB_Distrib_Comp_Server+4 myjob.sub
This job submission causes a second job submission.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
465534.radon-a kes standby myjob.sub 5620 1 1 -- 00:05 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
465534.radon-a kes standby myjob.sub 5620 1 1 -- 00:05 R 00:00
465545.radon-a kes standby Job2 -- 4 4 -- 00:01 R --
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows four processor cores (TSK) on four compute nodes (NDS).
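Because this method produces two PBS jobs, it can help to total the nodes and cores they hold together. The sketch below is illustrative only: it runs awk over two sample lines modeled on the listing above instead of calling qstat, and it assumes the NDS and TSK values are the sixth and seventh whitespace-separated fields, as in that listing. The helper name total_cores is hypothetical.

```shell
#!/bin/sh
# Two sample job lines modeled on the qstat listing above (not live output).
qstat_sample='465534.radon-a myusername standby myjob.sub 5620 1 1 -- 00:05 R 00:00
465545.radon-a myusername standby Job2 -- 4 4 -- 00:01 R --'

# total_cores (hypothetical helper): sum the NDS and TSK columns of job lines.
total_cores() {
    printf '%s\n' "$1" |
        awk '$1 ~ /^[0-9]+\./ { nodes += $6; cores += $7 }
             END { printf "nodes=%d cores=%d\n", nodes, cores }'
}

total_cores "$qstat_sample"
```

On a real system the same awk filter could read the output of qstat -u myusername; header lines are skipped because they do not begin with a numeric job ID.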
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a006.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
Starting matlabpool using the 'mypbsconfig' configuration ... connected to 4 labs.
Lab 1:
radon-a006.rcac.purdue.edu:4:1:1000
radon-a007.rcac.purdue.edu:4:2:1000
radon-a008.rcac.purdue.edu:4:3:1000
radon-a009.rcac.purdue.edu:4:4:1000
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
Output shows the name of one compute node (a006) that processed the job submission file myjob.sub. The job submission scattered four processor cores (four MATLAB labs) among four different compute nodes (a006,a007,a008,a009) that processed the four parallel regions. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Each lab received the broadcast value: 1,000. Function gcat() collected on Lab 1 the string formed by each lab, which includes the name of that lab's compute node.
Any output written to standard error will appear in myjob.sub.emyjobid.
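The names myjob.sub.omyjobid and myjob.sub.emyjobid follow from the job name and the numeric part of the job ID. As a small sketch, assuming the output and error paths were not overridden at submission, the hypothetical helper output_files below derives both file names from the job ID string that qsub prints.

```shell
#!/bin/sh
# output_files JOBNAME JOBID (hypothetical helper)
# Derives the default standard output and standard error file names.
output_files() {
    jobname=$1
    jobid=$2
    # Keep only the sequence number before the first dot of the job ID.
    seq=${jobid%%.*}
    echo "${jobname}.o${seq} ${jobname}.e${seq}"
}

# Example with a job ID in the form printed by qsub.
output_files myjob.sub 465534.radon-adm.rcac.purdue.edu
```

For the job ID in this example, the helper prints myjob.sub.o465534 and myjob.sub.e465534.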
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses purchased.
The sixth method of job submission uses the PBS qsub command to submit a job to a PBS queue. If your MATLAB client is compute-intensive or you are ready to move your application to a production mode, use this method to move your client to a compute node.
This method is like the fifth method since it uses the same MATLAB M-files and the same job submission file myjob.sub. This method differs from the fifth since it uses the MATLAB 'local' configuration rather than a PBS configuration.
Run MATLAB to set the default parallel configuration to the MATLAB 'local' configuration:
$ matlab -nodisplay
>> defaultParallelConfig('local');
>> quit;
$
Submit the job to a single compute node with five processor cores and request one PCT license:
$ qsub -l nodes=1:ppn=5,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115287.radon-ad myusername standby myjob.sub 24010 1 5 -- 00:01 R 00:00
Job status shows five processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a006.rcac.purdue.edu
< M A T L A B (R) >
Copyright 1984-2011 The MathWorks, Inc.
R2011b (7.13.0.564) 64-bit (glnxa64)
August 13, 2011
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
Lab 1:
radon-a006.rcac.purdue.edu:4:1:1000
radon-a006.rcac.purdue.edu:4:2:1000
radon-a006.rcac.purdue.edu:4:3:1000
radon-a006.rcac.purdue.edu:4:4:1000
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
Output shows that processor cores of one compute node (a006) processed the entire job. Output also shows that a "matlabpool using the 'local' configuration" is connected to four MATLAB labs. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of labs in the pool and assigns to each lab in the pool a unique value for variable labindex.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (R2009a) and 12 (R2011a). Finally, the value of ppn must be one greater than the value in matlabpool open.
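To keep ppn in step with the pool size, one option is to read the pool size from the M-file itself. The sketch below is hypothetical: it assumes the M-file contains a line of the exact form "matlabpool open N;" as in the myscript.m shown in this guide, and it demonstrates the helper pool_size on a temporary stand-in file rather than on your real script.

```shell
#!/bin/sh
# pool_size FILE (hypothetical helper)
# Prints N from the first line of the form "matlabpool open N;".
pool_size() {
    sed -n 's/^[[:space:]]*matlabpool open \([0-9][0-9]*\);.*$/\1/p' "$1" | head -n 1
}

# Demonstrate on a temporary stand-in for myscript.m.
tmp=$(mktemp)
printf '%s\n' '% FILENAME: myscript.m' 'matlabpool open 4;' 'spmd' \
    'end % spmd' 'matlabpool close force;' > "$tmp"

n=$(pool_size "$tmp")
# One extra core is needed for the MATLAB client.
echo "pool size: $n, ppn to request: $((n + 1))"
rm -f "$tmp"
```

The computed ppn could then be substituted into the qsub command in place of a hand-typed value.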
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
matlabpool open 13;
$ qsub -l nodes=1:ppn=14,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myscript (line 6)
matlabpool open 13;
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
The seventh method of job submission uses the MATLAB Compiler mcc to compile a MATLAB M-file with a PBS configuration and submits the compiled file to a PBS queue.
This method uses the same M-files as the fifth method. Like the fifth method, this method uses one fewer DCS license than the first method:
% FILENAME: myscript.m
% Specify pool size.
% Convert the parallel job to a pool job.
matlabpool open 4;
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
if labindex == 1
disp(result)
end
end % spmd
matlabpool close force;
quit;
% FILENAME: myfunction.m
function result = myfunction ()
result = 0;
% Specify pool size.
% Convert the parallel job to a pool job.
matlabpool open 4;
spmd
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
rslt = gcat(str,1,1);
end % spmd
result = rslt{1};
matlabpool close force;
end % function
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_myscript.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the loaded versions. Set the default parallel configuration to your PBS configuration and quit MATLAB. Compile the MATLAB script M-file. Make a subdirectory and copy two of the files that the compiler made and your job submission file. Make the new subdirectory the current working directory and submit:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig');
>> quit
$ mcc -m myscript.m
To obtain the name of the compute node which runs the compiler-generated script run_myscript.sh, insert the Linux commands echo and hostname before the echo statement that prints the line of dashes, so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_myscript.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \args
else
echo Setting up environment variables
MCRROOT="$1"
echo ---
LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
export LD_LIBRARY_PATH;
export XAPPLRESDIR;
echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
shift 1
"${exe_dir}"/myscript $*
fi
exit
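The manual edit described above can also be scripted, which may help because mcc regenerates the run script on each compile. The following is a minimal sketch and not part of the mcc output: the function name add_host_banner is hypothetical, and it assumes the generated script contains a line of the form echo "------" (dashes only) as in the listing above.

```shell
# add_host_banner SCRIPT
# Hypothetical helper: insert 'echo "<script name>"' and 'hostname'
# before the first line that echoes a row of dashes.
add_host_banner() {
    script=$1
    tmp=$(mktemp) || return 1
    awk -v name="$(basename "$script")" '
        !inserted && /^echo "-+"$/ {
            print "echo \"" name "\""
            print "hostname"
            inserted = 1
        }
        { print }
    ' "$script" > "$tmp" && cat "$tmp" > "$script"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

A call such as add_host_banner run_myscript.sh would be made once after each compile. The sketch is not idempotent: running it twice on the same file inserts the two lines twice.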
Submit the job to a single compute node with one processor core and request four DCS licenses:
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
This job runs myjob.sub on a compute node, which in turn submits the parallel job (converted to a pool job). The first job must run at least as long as the second job since it collects the results of the parallel job.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 5957 00:05 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
115292.radon-ad myusername standby myjob.sub 28611 1 1 5957 00:05 R 00:00
115293.radon-ad myusername standby Job1 29390 4 4 7005 00:01 R 00:00
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows that this job has submitted a second job with four processor cores (TSK) on four compute nodes (NDS).
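To read the node and core counts without scanning the table by eye, a small filter over the qstat output can help. This is a sketch under assumptions: it relies on the column layout shown above (NDS as the sixth field, TSK as the seventh) and on job lines beginning with a numeric job ID; the function name job_shape is hypothetical.

```shell
# job_shape: read `qstat -u USER` output on standard input and print
# "jobid nodes tasks" for each job line.
# Assumes the column layout shown above: field 6 is NDS, field 7 is TSK.
job_shape() {
    awk '$1 ~ /^[0-9]+\./ { print $1, $6, $7 }'
}
```

For example, qstat -u myusername | job_shape would print one line per job, skipping the header and separator lines.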
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a006.rcac.purdue.edu
run_myscript.sh
radon-a006.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB_R2010a/runtime/glnxa64:/apps/rhel5/MATLAB_R2010a/bin/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/os/glnxa64:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB_R2010a/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
Starting matlabpool using the 'PBSscatter' configuration ... connected to 4 labs.
Lab 1:
hansen-a006.rcac.purdue.edu:4:1:1000
hansen-a007.rcac.purdue.edu:4:2:1000
hansen-a008.rcac.purdue.edu:4:3:1000
hansen-a009.rcac.purdue.edu:4:4:1000
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
Output shows the name of the compute node (a006) that ran the job submission file myjob.sub and the compiler-generated script run_myscript.sh and the names of the four compute nodes (a006,a007,a008,a009) that ran the four scattered processor cores (four MATLAB labs) that processed the four parallel regions. Unlike a parfor loop, an spmd statement sets variable numlabs to the number of parallel regions (4) in the pool and assigns to each lab running a parallel region a unique value for variable labindex. There are four copies of the parallel job, so there are four lab IDs.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply this method of job submission to a MATLAB function M-file, prepare a wrapper script which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m:
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
Compile both the wrapper and the function then submit:
$ mcc -m mywrapper.m myfunction.m
$ mkdir test
$ cp mywrapper test
$ cp run_mywrapper.sh test
$ cp myjob.sub test
$ cd test
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running parallel job. Secondly, increase the wall time of mypbsconfig by using the Configuration Manager in the Parallel menu to enter a new wall time in the property SubmitArguments. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of DCS licenses acquired. Increase the value of MATLAB_Distrib_Comp_Server in the qsub command to match the new size of the pool.
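Because the pool size in matlabpool open and the DCS license count in the qsub command must change together, it can help to derive the resource list from a single value. The following is a minimal sketch, assuming the gres name and the nodes=1:ppn=1 shape used in the qsub command above; the function name dcs_resources is hypothetical, and the wall time stored in mypbsconfig still has to be changed separately in MATLAB.

```shell
# dcs_resources POOL_SIZE WALLTIME
# Print a qsub -l resource list for the seventh method: one core for the
# submitting job plus one DCS license per lab in the MATLAB pool.
dcs_resources() {
    pool=$1
    walltime=$2
    # Reject empty or non-numeric pool sizes.
    case $pool in
        ''|*[!0-9]*) echo "pool size must be a positive integer" >&2; return 1 ;;
    esac
    [ "$pool" -ge 1 ] || { echo "pool size must be a positive integer" >&2; return 1; }
    echo "nodes=1:ppn=1,walltime=${walltime},gres=MATLAB_Distrib_Comp_Server+${pool}"
}
```

With matlabpool open 8 in the M-file, the submission could then read: qsub -l "$(dcs_resources 8 00:30:00)" myjob.sub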
The eighth method of job submission uses the MATLAB Compiler mcc to compile a MATLAB M-file with the 'local' configuration and submits the compiled file to a PBS queue.
This method is similar to the seventh method since it uses the same MATLAB M-files and the same job submission file myjob.sub. This method differs from the seventh since it uses the MATLAB 'local' configuration.
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the versions loaded. Set the default parallel configuration to the MATLAB 'local' configuration and compile the MATLAB script M-file. Make a subdirectory and copy two of the files that the compiler made and your job submission file. Make the new subdirectory the current working directory and submit:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('local')
>> quit
$ mcc -m myscript.m
To obtain the name of the compute node which runs the compiler-generated script run_myscript.sh, insert the Linux commands echo and hostname before the echo statement that prints the line of dashes, so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_myscript.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \args
else
echo Setting up environment variables
MCRROOT="$1"
echo ---
LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
export LD_LIBRARY_PATH;
export XAPPLRESDIR;
echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
shift 1
"${exe_dir}"/myscript $*
fi
exit
Submit the job to a single compute node with four processor cores:
$ qsub -l nodes=1:ppn=4,walltime=00:05:00 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
135783.radon-a myusername standby myjob.sub 18893 1 4 -- 00:05 R 00:00
Job status shows four processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a006.rcac.purdue.edu
run_myscript.sh
radon-a006.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
Warning: No display specified. You will not be able to display graphics on the screen.
Starting matlabpool using the 'local' configuration ... connected to 4 labs.
Lab 1:
radon-a006.rcac.purdue.edu:4:1:1000
radon-a006.rcac.purdue.edu:4:2:1000
radon-a006.rcac.purdue.edu:4:3:1000
radon-a006.rcac.purdue.edu:4:4:1000
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.
Output shows the name of the one compute node (a006) that ran the entire job. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Each lab received the broadcast value: 1,000. Function gcat() collected on lab 1 the string formed by each lab, which includes the name of that lab's compute node.
Any output written to standard error will appear in myjob.sub.emyjobid.
To apply the eighth method of job submission to a MATLAB function M-file, prepare a wrapper script which receives and displays the result of myfunction.m. Use an appropriate filename, here named mywrapper.m:
% FILENAME: mywrapper.m
result = myfunction();
disp(result)
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mywrapper.sh /apps/rhel5/MATLAB/R2011b
Compile both the wrapper and the function then submit:
$ mcc -m mywrapper.m myfunction.m
$ qsub -l nodes=1:ppn=4,walltime=00:01:00 myjob.sub
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running job. Also, consider increasing the size of the MATLAB pool, the value which appears in the statement matlabpool open. The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (as of R2009a) and 12 (as of R2011b). Finally, the value of ppn must equal the value in the matlabpool open statement.
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
matlabpool open 13;
$ qsub -l nodes=1:ppn=13,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using matlabpool (line 136)
Failed to open matlabpool. (For information in addition to the causing error,
validate the configuration 'local' in the Configurations Manager.)
Error in myscript (line 6)
matlabpool open 13;
Caused by:
Error using distcomp.interactiveclient/start (line 88)
Failed to start matlabpool.
This is caused by:
You requested a minimum of 13 workers but only 12 workers are allowed with
the local scheduler.
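A quick check before submission can catch this mismatch before the job waits in the queue. The sketch below is hedged: the function name local_resources is hypothetical, and the worker limit is passed as an argument rather than fixed, since it depends on the MATLAB release (12 is the figure reported in the error above). Because ppn must equal the pool size, the limit should also not exceed the number of cores on one node (8 on Radon).

```shell
# local_resources POOL_SIZE WALLTIME [MAX_WORKERS]
# Print a qsub -l resource list for the 'local' configuration, where ppn
# must equal the pool size, or fail if the pool exceeds MAX_WORKERS.
local_resources() {
    pool=$1
    walltime=$2
    max=${3:-12}   # assumed default: the limit reported in the error above
    # Reject empty or non-numeric pool sizes.
    case $pool in
        ''|*[!0-9]*) echo "pool size must be a positive integer" >&2; return 1 ;;
    esac
    if [ "$pool" -lt 1 ] || [ "$pool" -gt "$max" ]; then
        echo "pool size $pool is outside 1..$max" >&2
        return 1
    fi
    echo "nodes=1:ppn=${pool},walltime=${walltime}"
}
```

For instance, local_resources 4 00:05:00 prints the resource list used in the submission above, while local_resources 13 00:01:00 fails with a message instead of producing a job that cannot open its pool.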
The ninth method of job submission uses the MATLAB Compiler mcc to compile job submission details with the MATLAB 'torque' scheduler in a script M-file and the code of a parallel job in a function M-file. Since the compilation includes a call to function submit(), the program can be a parallel job. Then the method submits the compiled file to a PBS queue.
Prepare a MATLAB function M-file (function submit() accepts only a function M-file). Use an appropriate filename, here named myfunction.m:
% FILENAME: myfunction.m
function result = myfunction ()
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
% Apply global concatenate to all str's.
% Store the concatenation of str's in the first dimension (row) and on lab #1.
result = gcat(str,1,1);
end
Prepare a MATLAB script M-file which finds the MATLAB scheduler 'torque', defines a MATLAB parallel job with four workers and one task, and calls MATLAB function submit(). Use an appropriate filename, here named mypbssubmit.m:
% FILENAME: mypbssubmit.m
sched = findResource('scheduler','type','torque');
set(sched,'ClusterMatlabRoot',matlabroot);
set(sched,'SubmitArguments','-l walltime=00:01:00,gres=MATLAB_Distrib_Comp_Server+4');
pjob=createParallelJob(sched);
set(pjob,'MinimumNumberOfWorkers',4);
set(pjob,'MaximumNumberOfWorkers',4);
set(pjob,'FileDependencies',{'myfunction.m'});
task = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING')
pjob.wait;
results = getAllOutputArguments(pjob);
disp(results{1})
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mypbssubmit.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the versions loaded. Set the default parallel configuration to your PBS configuration and quit MATLAB. Compile the MATLAB script M-file along with the code of the parallel job:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('mypbsconfig')
>> quit
$ mcc -m mypbssubmit.m myfunction.m
To obtain the name of the compute node which runs the compiler-generated script run_mypbssubmit.sh, insert the Linux commands echo and hostname before the echo statement that prints the line of dashes, so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_mypbssubmit.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \args
else
echo Setting up environment variables
MCRROOT="$1"
echo ---
LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
export LD_LIBRARY_PATH;
export XAPPLRESDIR;
echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
shift 1
"${exe_dir}"/mypbssubmit $*
fi
exit
Submit the job to a single compute node with one processor core and request four DCS licenses:
$ qsub -l nodes=1:ppn=1,walltime=00:05:00,gres=MATLAB_Distrib_Comp_Server+4 myjob.sub
This job runs myjob.sub on a compute node, which in turn submits the parallel job. The first job must run at least as long as the second job since it collects the results of the parallel job.
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
135783.radon-a myusername standby myjob.sub 18893 1 1 -- 00:05 R 00:00
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
135783.radon-a myusername standby myjob.sub 18893 1 1 -- 00:05 R 00:00
135784.radon-a myusername standby Job1 19382 4 4 -- 00:01 R 00:00
At first, job status shows one processor core (TSK) on one compute node (NDS). Then, job status shows that this job has submitted a second job with four processor cores (TSK) on four compute nodes (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a002.rcac.purdue.edu
run_mypbssubmit.sh
radon-a002.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB/R2011b/runtime/glnxa64:/apps/rhel5/MATLAB/R2011b/bin/glnxa64:/apps/rhel5/MATLAB/R2011b/sys/os/glnxa64:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
FINISHED SUBMITTING
radon-a002.rcac.purdue.edu:4:1:1000
radon-a003.rcac.purdue.edu:4:2:1000
radon-a006.rcac.purdue.edu:4:3:1000
radon-a007.rcac.purdue.edu:4:4:1000
Output shows that the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (4), a nonnegative scalar integer, defined the number of labs (4) in the pool. Because the MATLAB pool requires N labs, there must be at least N processor cores available on the cluster.
Output shows the name of the compute node (a002) that ran the job submission file myjob.sub and the compiler-generated script run_mypbssubmit.sh, and the names of the four compute nodes (a002, a003, a006, a007) that ran the four scattered processor cores (four MATLAB labs) that processed the four copies of the parallel job. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Each lab received the broadcast value: 1,000. Function gcat() collected on lab 1 the string formed by each lab, which includes the name of that lab's compute node.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running parallel job. Secondly, increase the wall time in the SubmitArguments property which mypbssubmit.m sets. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of DCS licenses acquired. Increase the value of MATLAB_Distrib_Comp_Server, in both the qsub command and SubmitArguments, to match the value of MinimumNumberOfWorkers.
The tenth method of job submission uses the MATLAB Compiler mcc to compile job submission details with the MATLAB 'local' scheduler in a script M-file and the code of a parallel job in a function M-file. Since the compilation includes a call to function submit(), the program can be a parallel job. Then the method submits the compiled file to a PBS queue.
Prepare a MATLAB function M-file (function submit() accepts only a function M-file). Use an appropriate filename, here named myfunction.m:
function result = myfunction
if labindex == 1
% Lab (rank) #1 broadcasts an integer value to other labs (ranks).
N = labBroadcast(1,int64(1000));
else
% Each lab (rank) receives the broadcast value from lab (rank) #1.
N = labBroadcast(1);
end
% Form a string with host name, total number of labs, lab ID, and broadcast value.
[c name] =system('hostname');
name = name(1:length(name)-1);
fmt = num2str(floor(log10(numlabs))+1);
str = sprintf(['%s:%d:%' fmt 'd:%d '], name,numlabs,labindex,N);
result = gcat(str,1,1)
end % function
Prepare a MATLAB script M-file which finds the MATLAB 'local' scheduler, defines a MATLAB parallel job with four workers and one task, and calls MATLAB function submit(). Use an appropriate filename, here named mylclsubmit.m:
% FILENAME: mylclsubmit.m
!echo "mylclsubmit.m"
!hostname
sched = findResource('scheduler','type','local');
set(sched,'ClusterMatlabRoot',matlabroot);
pjob=createParallelJob(sched);
set(pjob,'MinimumNumberOfWorkers',4);
set(pjob,'MaximumNumberOfWorkers',4);
set(pjob,'FileDependencies',{'myfunction.m'});
T = createTask(pjob,@myfunction,1,{});
submit(pjob);
disp('FINISHED SUBMITTING')
pjob.wait;
results = getAllOutputArguments(pjob);
disp(results{1})
quit;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
echo "myjob.sub"
hostname
cd $PBS_O_WORKDIR
unset DISPLAY
./run_mylclsubmit.sh /apps/rhel5/MATLAB/R2011b
On a front end, load modules for MATLAB and GCC. The MATLAB R2011b Compiler mcc depends on shared libraries from GCC Version 4.3.x. GCC 4.6.2 is available on Radon. Verify the versions loaded. Set the default parallel configuration to the MATLAB 'local' configuration (R2011a added support for compiling PCT code on the 'local' configuration) and quit MATLAB. Compile the MATLAB script M-file along with the code of the parallel job:
$ module load matlab/R2011b
$ module load gcc/4.6.2
$ module list
1) matlab/R2011b 2) gcc/4.6.2
$ matlab -nodisplay
>> defaultParallelConfig('local')
>> quit
$ mcc -m mylclsubmit.m myfunction.m
To obtain the name of the compute node which runs the compiler-generated script run_mylclsubmit.sh, insert the Linux commands echo and hostname before the echo statement that prints the line of dashes, so that the script appears as follows:
#!/bin/sh
# script for execution of deployed applications
#
# Sets up the MCR environment for the current $ARCH and executes
# the specified command.
#
exe_name=$0
exe_dir=`dirname "$0"`
echo "run_mylclsubmit.sh"
hostname
echo "------------------------------------------"
if [ "x$1" = "x" ]; then
echo Usage:
echo $0 \args
else
echo Setting up environment variables
MCRROOT="$1"
echo ---
LD_LIBRARY_PATH=.:${MCRROOT}/runtime/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/bin/glnxa64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRROOT}/sys/os/glnxa64;
MCRJRE=${MCRROOT}/sys/java/jre/glnxa64/jre/lib/amd64 ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/native_threads ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/server ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE}/client ;
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${MCRJRE} ;
XAPPLRESDIR=${MCRROOT}/X11/app-defaults ;
export LD_LIBRARY_PATH;
export XAPPLRESDIR;
echo LD_LIBRARY_PATH is ${LD_LIBRARY_PATH};
shift 1
"${exe_dir}"/mylclsubmit $*
fi
exit
Submit the job to a single compute node with four processor cores:
$ qsub -l nodes=1:ppn=4,walltime=00:05:00 myjob.sub
View job status:
$ qstat -u myusername
radon-adm.rcac.purdue.edu:
Req'd Req'd Elap
Job ID Username Queue Jobname SessID NDS TSK Memory Time S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
135783.radon-a myusername standby myjob.sub 18893 1 4 -- 00:05 R 00:00
Job status shows four processor cores (TSK) on one compute node (NDS).
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
myjob.sub
radon-a006.rcac.purdue.edu
run_mylclsubmit.sh
radon-a006.rcac.purdue.edu
------------------------------------------
Setting up environment variables
---
LD_LIBRARY_PATH is .:/apps/rhel5/MATLAB/R2011b/runtime/glnxa64:/apps/rhel5/MATLAB/R2011b/bin/glnxa64:/apps/rhel5/MATLAB/R2011b/sys/os/glnxa64:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/native_threads:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/server:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64/client:/apps/rhel5/MATLAB/R2011b/sys/java/jre/glnxa64/jre/lib/amd64
Warning: No display specified. You will not be able to display graphics on the screen.
FINISHED SUBMITTING
radon-a006.rcac.purdue.edu:4:1:1000
radon-a006.rcac.purdue.edu:4:2:1000
radon-a006.rcac.purdue.edu:4:3:1000
radon-a006.rcac.purdue.edu:4:4:1000
Output shows that the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers (4), a nonnegative scalar integer, defined the number of labs (4) in the pool. Because the MATLAB pool requires N workers, there must be at least N processor cores available on the compute node.
Output shows that one compute node (a006) processed the entire job. Output also shows that the value of variable numlabs is the number of labs (4) and that the program assigned to each lab a unique value for variable labindex. There are four labs, so there are four lab IDs. Each lab received the broadcast value: 1,000. Function gcat() collected on lab 1 the string formed by each lab, which includes the name of that lab's compute node.
Any output written to standard error will appear in myjob.sub.emyjobid.
To scale up this method to handle a real application, increase the wall time in the qsub command to accommodate a longer running parallel job. Also, consider increasing the size of the MATLAB pool, the value of the properties MinimumNumberOfWorkers and MaximumNumberOfWorkers which appear as arguments in the calls of function set(). The maximum possible size of the pool is the number of workers allowed in the 'local' configuration: 8 (as of R2009a) and 12 (as of R2011b). Finally, the value of ppn must equal the value of MaximumNumberOfWorkers.
Specifying a MATLAB pool with 13 labs exceeds the 'local' configuration of MATLAB R2011b. The relevant lines of code and the error follow:
set(pjob,'MinimumNumberOfWorkers',13);
set(pjob,'MaximumNumberOfWorkers',13);
$ qsub -l nodes=1:ppn=13,walltime=00:01:00,gres=Parallel_Computing_Toolbox+1 myjob.sub
{Error using distcomp.simpleparalleljob/pSetMinimumNumberOfWorkers (line 59)
MinimumNumberOfWorkers must be the same as or less than MaximumNumberOfWorkers
for a job
Error in mylclsubmit (line 9)
}
distcomp:job:InvalidProperty
For more information about parallel jobs:
GNU Octave is a high-level, interpreted, programming language for numerical computations. The Octave interpreter is the part of Octave which reads M-files, oct-files, and MEX-files and executes Octave statements. Octave is a structured language (similar to C) and mostly compatible with MATLAB. You may use Octave to avoid the need for a MATLAB license, both during development and as a deployed application. By doing so, you may be able to run your application on more systems or more easily distribute it to others.
This section illustrates how to submit a small Octave job to a PBS queue. This Octave example computes the inverse of a matrix.
Prepare an Octave-compatible M-file with an appropriate filename, here named myjob.m:
% FILENAME: myjob.m
% Invert matrix A.
A = [1 2 3; 4 5 6; 7 8 0]
inv(A)
quit
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub
module load octave
cd $PBS_O_WORKDIR
unset DISPLAY
# Use the -q option to suppress startup messages.
# octave -q < myjob.m
octave < myjob.m
The command octave myjob.m (without the redirection) also works in the preceding script.
OR:
#!/bin/sh -l
# FILENAME: myjob.sub
module load octave
unset DISPLAY
# Use the -q option to suppress startup messages.
# octave -q << EOF
octave << EOF
% Invert matrix A.
A = [1 2 3; 4 5 6; 7 8 0]
inv(A)
quit
EOF
# end of Octave commands
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
A =
   1   2   3
   4   5   6
   7   8   0
ans =
  -1.77778   0.88889  -0.11111
   1.55556  -0.77778   0.22222
  -0.11111   0.22222  -0.11111
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about Octave:
Octave does not offer a compiler to translate an M-file into an executable file for additional speed or distribution. You may wish to consider recoding an M-file as either an oct-file or a stand-alone program.
An oct-file is an "Octave Executable". It offers a way for Octave code to call functions written in C, C++, or Fortran as though these external functions were built-in Octave functions. You may wish to use an oct-file if you would like to call an existing C, C++, or Fortran function directly from Octave rather than reimplementing that code as an Octave function. Also, by implementing performance-critical routines in C, C++, or Fortran rather than Octave, you may be able to substantially improve performance over Octave source code, especially for statements like for and while.
This section illustrates how to submit a small Octave job with an oct-file to a PBS queue. This Octave example calls a C function which adds two matrices.
Prepare a complicated and time-consuming computation in the form of a C, C++, or Fortran function. In this example, the computation is a C function which adds two matrices:
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, int n) {
int i;
/* Component-wise addition. */
for (i=0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
Combine the computational routine with an oct-file, which contains the necessary external function interface of Octave. The name of the file is matrixSum.cc:
/**********************************************************
* FILENAME: matrixSum.cc
*
* Adds two MxN arrays (inMatrix).
* Outputs one MxN array (outMatrix).
*
* The calling syntax is:
*
* matrixSum (inMatrix, inMatrix, outMatrix, size)
*
* This is an oct-file for Octave.
*
**********************************************************/
#include <octave/oct.h>
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, int n) {
int i;
/* Component-wise addition. */
for (i=0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
/* Gateway Function */
DEFUN_DLD (matrixSum, args, nargout, "matrixSum: A + B") {
NDArray inMatrix_a; /* mxn input matrix */
NDArray inMatrix_b; /* mxn input matrix */
int nrows_a,ncols_a; /* size of matrix a */
int nrows_b,ncols_b; /* size of matrix b */
NDArray outMatrix_c; /* mxn output matrix */
/* Check for proper number of input arguments */
if (args.length() != 2) {
printf("matrixSum: two inputs required.");
exit(-1);
}
/* Check for proper number of output arguments */
if (nargout != 1) {
printf("matrixSum: one output required.");
exit(-1);
}
/* Check that both input matrices are real matrices. */
if (!args(0).is_real_matrix()) {
printf("matrixSum: expecting LHS (arg 1) to be a real matrix");
exit(-1);
}
if (!args(1).is_real_matrix()) {
printf("matrixSum: expecting RHS (arg 2) to be a real matrix");
exit(-1);
}
/* Get dimensions of the first input matrix */
nrows_a = args(0).rows();
ncols_a = args(0).columns();
/* Get dimensions of the second input matrix */
nrows_b = args(1).rows();
ncols_b = args(1).columns();
/* Check for equal number of rows. */
if(nrows_a != nrows_b) {
printf("matrixSum: unequal number of rows.");
exit(-1);
}
/* Check for equal number of columns. */
if(ncols_a != ncols_b) {
printf("matrixSum: unequal number of columns.");
exit(-1);
}
/* Make a pointer to the real data in the first input matrix */
inMatrix_a = args(0).array_value();
/* Make a pointer to the real data in the second input matrix */
inMatrix_b = args(1).array_value();
/* Construct output matrix as a copy of the first input matrix. */
outMatrix_c = args(0).array_value();
/* Call the computational routine. */
double* ptr_a = inMatrix_a.fortran_vec();
double* ptr_b = inMatrix_b.fortran_vec();
double* ptr_c = outMatrix_c.fortran_vec();
matrixSum(ptr_a,ptr_b,ptr_c,nrows_a*ncols_a);
return octave_value(outMatrix_c);
}
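The gateway function can pass a single element count (nrows_a*ncols_a) to the computational routine because an Octave NDArray keeps its elements in one contiguous, column-major buffer, and component-wise addition does not depend on where the row and column boundaries fall. The following Python sketch mimics that reasoning with plain lists. The helper names (to_column_major, from_column_major, matrix_sum_flat) are illustrative only and are not part of Octave.

```python
def to_column_major(rows):
    """Flatten a list of rows into one column-major list (the layout fortran_vec() exposes)."""
    nrows, ncols = len(rows), len(rows[0])
    return [rows[r][c] for c in range(ncols) for r in range(nrows)]


def from_column_major(flat, nrows, ncols):
    """Rebuild a list of rows from a column-major list."""
    return [[flat[c * nrows + r] for c in range(ncols)] for r in range(nrows)]


def matrix_sum_flat(a, b):
    """Component-wise sum over two flat buffers of equal length."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return [x + y for x, y in zip(a, b)]


# Same matrices as in myjob.m.
A = [[1, 1, 1], [1, 1, 1]]
B = [[2, 2, 2], [2, 2, 2]]

flat_c = matrix_sum_flat(to_column_major(A), to_column_major(B))
C = from_column_major(flat_c, nrows=2, ncols=3)
print(C)  # [[3, 3, 3], [3, 3, 3]]
```

This only works because both inputs share the same shape and layout, which is why the gateway compares the row and column counts before calling the routine.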
To access the Octave utility mkoctfile, load an Octave module. Loading Octave also loads a compatible GCC:
$ module load octave
To compile matrixSum.cc into an oct-file:
$ mkoctfile matrixSum.cc
Two new files appear after the compilation:
matrixSum.o matrixSum.oct
The name of the Octave-callable oct-file is matrixSum.oct.
Prepare an Octave-compatible M-file with an appropriate filename, here named myjob.m:
% FILENAME: myjob.m

% Call the separately compiled and dynamically linked oct-file.
A = [1,1,1;1,1,1]
B = [2,2,2;2,2,2]
C = matrixSum(A,B)

quit
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load octave
cd $PBS_O_WORKDIR
unset DISPLAY

# Use the -q option to suppress startup messages.
# octave -q < myjob.m
octave < myjob.m
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
A =

   1   1   1
   1   1   1

B =

   2   2   2
   2   2   2

C =

   3   3   3
   3   3   3
Any output written to standard error will appear in myjob.sub.emyjobid.
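The two file names above follow the default PBS naming rule: the job name, then ".o" or ".e", then the numeric part of the job identifier that qsub prints. A small Python sketch of that rule follows. The function name and the sample identifier 123456.radon-adm.example are made up for illustration, and the sketch assumes default naming (no -N, -o, or -e options and no job arrays).

```python
def pbs_output_files(job_name, job_id):
    """Return (stdout_file, stderr_file) for a job under default PBS naming."""
    # qsub prints identifiers such as "123456.servername"; keep the number.
    sequence_number = job_id.split(".", 1)[0]
    stdout_file = "{0}.o{1}".format(job_name, sequence_number)
    stderr_file = "{0}.e{1}".format(job_name, sequence_number)
    return stdout_file, stderr_file


# Hypothetical identifier, as qsub might print it.
out_file, err_file = pbs_output_files("myjob.sub", "123456.radon-adm.example")
print(out_file)  # myjob.sub.o123456
print(err_file)  # myjob.sub.e123456
```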
For more information about the Octave oct-file:
A stand-alone Octave program is a C, C++, or Fortran program which calls user-written oct-files and the same libraries that Octave uses. A stand-alone program has access to Octave objects, such as the array and matrix classes, as well as all the Octave algorithms. If you would like to implement performance-critical routines in C, C++, or Fortran and still call selected Octave functions, a stand-alone Octave program may be a good option. This offers the possibility of substantially improved performance over Octave source code, especially for statements like for and while, while still allowing use of specialized Octave functions where useful.
This section illustrates how to submit a small, stand-alone Octave program to a PBS queue. This C++ example uses class Matrix and calls an Octave script which prints a message.
Prepare an Octave-compatible M-file with an appropriate filename, here named hello.m:
% FILENAME: hello.m
disp('hello.m: hello, world')
Prepare a C++ function file with the necessary external function interface and with an appropriate filename, here named hello.cc:
// FILENAME: hello.cc
#include <iostream>
#include <octave/oct.h>
#include <octave/octave.h>
#include <octave/parse.h>
#include <octave/toplev.h> /* do_octave_atexit */
int main (const int argc, char ** argv) {
const char * argvv [] = {"" /* name of program, not relevant */, "--silent"};
octave_main (2, (char **) argvv, true /* embedded */);
std::cout << "hello.cc: hello, world" << std::endl;
const octave_value_list result = feval ("hello"); /* invoke hello.m */
int n = 2;
Matrix a_matrix = Matrix (1,2);
a_matrix (0,0) = 888;
a_matrix (0,1) = 999;
std::cout << "hello.cc: " << a_matrix;
do_octave_atexit ();
}
To access the Octave utility mkoctfile, load an Octave module. Loading Octave also loads a compatible GCC:
$ module load octave
To compile the stand-alone Octave program:
$ mkoctfile --link-stand-alone hello.cc -o hello
Two new files appear after the compilation:
hello hello.o
The name of the compiled, stand-alone Octave program is hello.
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load gcc
cd $PBS_O_WORKDIR
unset DISPLAY

./hello
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
hello.cc: hello, world
hello.m: hello, world
hello.cc: 888 999
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about the Octave stand-alone program:
MEX stands for "MATLAB Executable". A MEX-file offers a way for MATLAB code to call functions written in C, C++ or Fortran as though these external functions were built-in MATLAB functions. You may wish to use a MEX-file if you would like to call an existing C, C++, or Fortran function directly from MATLAB rather than reimplementing that code as a MATLAB function. Also, by implementing performance-critical routines in C, C++, or Fortran rather than MATLAB, you may be able to substantially improve performance over MATLAB source code, especially for statements like for and while.
Octave includes an interface which can link compiled, legacy MEX-files. This interface allows sharing code between Octave and MATLAB users. In Octave, an oct-file will always perform better than a MEX-file, so you should write new code using the oct-file interface, if possible. However, you may test a new MEX-file in Octave then use it in a MATLAB application.
This section illustrates how to submit a small Octave job with a MEX-file to a PBS queue. This Octave example calls a C function which adds two matrices.
Prepare a complicated and time-consuming computation in the form of a C, C++, or Fortran function. In this example, the computation is a C function which adds two matrices:
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, int n) {
int i;
/* Component-wise addition. */
for (i=0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
Combine the computational routine with a MEX-file, which contains the necessary external function interface of MATLAB. In the computational routine, change int to mwSize. The name of the file is matrixSum.c:
/*************************************************************
* FILENAME: matrixSum.c
*
* Adds two MxN arrays (inMatrix).
* Outputs one MxN array (outMatrix).
*
* The calling syntax is:
*
* matrixSum(inMatrix, inMatrix, outMatrix, size)
*
* This is a MEX-file which Octave will execute.
*
**************************************************************/
#include "mex.h"
/* Computational Routine */
void matrixSum (double *a, double *b, double *c, mwSize n) {
mwSize i;
/* Component-wise addition. */
for (i=0; i<n; i++) {
c[i] = a[i] + b[i];
}
}
/* Gateway Function */
void mexFunction (int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[]) {
double *inMatrix_a; /* mxn input matrix */
double *inMatrix_b; /* mxn input matrix */
mwSize nrows_a,ncols_a; /* size of matrix a */
mwSize nrows_b,ncols_b; /* size of matrix b */
double *outMatrix_c; /* mxn output matrix */
/* Check for proper number of arguments */
if(nrhs!=2) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:nrhs","Two inputs required.");
}
if(nlhs!=1) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:nlhs","One output required.");
}
/* Get dimensions of the first input matrix */
nrows_a = mxGetM(prhs[0]);
ncols_a = mxGetN(prhs[0]);
/* Get dimensions of the second input matrix */
nrows_b = mxGetM(prhs[1]);
ncols_b = mxGetN(prhs[1]);
/* Check for equal number of rows. */
if(nrows_a != nrows_b) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:notEqual","Unequal number of rows.");
}
/* Check for equal number of columns. */
if(ncols_a != ncols_b) {
mexErrMsgIdAndTxt("MyToolbox:matrixSum:notEqual","Unequal number of columns.");
}
/* Make a pointer to the real data in the first input matrix */
inMatrix_a = mxGetPr(prhs[0]);
/* Make a pointer to the real data in the second input matrix */
inMatrix_b = mxGetPr(prhs[1]);
/* Make the output matrix */
plhs[0] = mxCreateDoubleMatrix(nrows_a,ncols_a,mxREAL);
/* Make a pointer to the real data in the output matrix */
outMatrix_c = mxGetPr(plhs[0]);
/* Call the computational routine */
matrixSum(inMatrix_a,inMatrix_b,outMatrix_c,nrows_a*ncols_a);
}
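The switch from int to mwSize matters for the element count nrows_a*ncols_a. A 32-bit signed int cannot represent counts above 2,147,483,647, whereas MATLAB's large-array API defines mwSize as a wider type. The exact width of mwSize in Octave's mex.h depends on how Octave was built, so treat that part as an assumption to verify on your system. The Python sketch below uses ctypes only to show what a 32-bit counter would do with a large count; the helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1


def fits_in_int32(nrows, ncols):
    """True if the element count nrows*ncols fits in a 32-bit signed int."""
    return nrows * ncols <= INT32_MAX


def as_int32(value):
    """Value left in a 32-bit signed counter after wrap-around.

    Signed overflow is undefined behavior in C; wrap-around is only the
    commonly observed outcome, shown here for illustration.
    """
    return ctypes.c_int32(value).value


for nrows, ncols in [(2, 3), (50000, 50000)]:
    count = nrows * ncols
    print("{0} x {1}: count={2} fits={3} int32={4}".format(
        nrows, ncols, count, fits_in_int32(nrows, ncols), as_int32(count)))
```

For the 2x3 matrices in this example either type works; the distinction only becomes visible for very large arrays.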
To access the Octave utility mkoctfile, load an Octave module. Loading Octave also loads a compatible GCC:
$ module load octave
To compile matrixSum.c into a MEX-file:
$ mkoctfile --mex matrixSum.c
Two new files appear after the compilation:
matrixSum.mex matrixSum.o
The name of the Octave-callable MEX-file is matrixSum.mex.
Prepare an Octave-compatible M-file with an appropriate filename, here named myjob.m:
% FILENAME: myjob.m

% Call the separately compiled and dynamically linked MEX-file.
A = [1,1,1;1,1,1]
B = [2,2,2;2,2,2]
C = matrixSum(A,B)

quit
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load octave
cd $PBS_O_WORKDIR
unset DISPLAY

# Use the -q option to suppress startup messages.
# octave -q < myjob.m
octave < myjob.m
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
A =

   1   1   1
   1   1   1

B =

   2   2   2
   2   2   2

C =

   3   3   3
   3   3   3
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about the Octave-compatible MEX-file:
Perl is a high-level, general-purpose, interpreted, dynamic programming language offering powerful text processing features. This section illustrates how to submit a small Perl job to a PBS queue. This Perl example prints a single line of text.
Prepare a Perl input file with an appropriate filename, here named myjob.in:
# FILENAME: myjob.in

print "hello, world\n"
Discover the absolute path of Perl:
$ which perl
/usr/local/bin/perl
There is a second absolute path: /usr/bin/perl.
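Because more than one Perl may be installed, it can help to list every perl found on your search path rather than only the first one that which reports. A minimal Python sketch follows; find_all_on_path is a hypothetical helper, not a system utility, and it only sees directories that are actually on PATH.

```python
import os


def find_all_on_path(program, path=None):
    """Return every executable named `program` on the search path, in order."""
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries instead of treating them as "."
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            if candidate not in matches:
                matches.append(candidate)
    return matches


# The first entry printed is the one a bare "perl" command would run.
for perl_path in find_all_on_path("perl"):
    print(perl_path)
```

Using an absolute path in the job submission file, as the example below does, removes any dependence on the order of PATH.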
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

cd $PBS_O_WORKDIR
unset DISPLAY

# Use the -w option to issue warnings.
/usr/bin/perl -w myjob.in
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
hello, world
Any output written to standard error will appear in myjob.sub.emyjobid.
For more information about Perl:
Python is a high-level, general-purpose, interpreted, dynamic programming language offering powerful text processing features. This section illustrates how to submit a small Python job to a PBS queue. This Python example prints a single line of text.
Prepare a Python input file with an appropriate filename, here named myjob.in:
# FILENAME: myjob.in

import string, sys
print "hello, world"
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load python
cd $PBS_O_WORKDIR
unset DISPLAY

python myjob.in
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
hello, world
Any output written to standard error will appear in myjob.sub.emyjobid.
If you would like to install a Python package for your own personal use, you may do so by following these directions. Make sure you have a download link to the software you want to use and substitute it on the wget line.
$ mkdir ~/src
$ cd ~/src
$ wget http://path/to/source/tarball/app-1.0.tar.gz
$ tar xzvf app-1.0.tar.gz
$ cd app-1.0
$ module load python/2.7.2
$ python setup.py install --user
$ cd ~
$ python
>>> import app
>>> quit()
The "import app" line should return without any output if the package installed successfully. You can then import the package in your Python scripts.
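The same check can be scripted, which is handy at the top of a batch job that depends on a personally installed package. The sketch below assumes you know the package's import name; app is the placeholder name used above and is_importable is a hypothetical helper. It also prints the per-user site-packages directory, which is where an install with --user normally places packages.

```python
import importlib
import site


def is_importable(module_name):
    """Return True if `module_name` can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Directory searched for packages installed with "--user".
print("per-user site-packages: {0}".format(site.getusersitepackages()))

# "app" is a placeholder; substitute the import name of your package.
print("app importable: {0}".format(is_importable("app")))
```

Run the check with the same python module loaded that you used for the install, since each Python version has its own per-user directory.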
For more information about Python:
R, a GNU project, is a language and environment for statistics and graphics. It is an open source version of the S programming language. This section illustrates how to submit a small R job to a PBS queue. This R example computes a Pythagorean triple.
Prepare an R input file with an appropriate filename, here named myjob.in:
# FILENAME: myjob.in

# Compute a Pythagorean triple.
a = 3
b = 4
c = sqrt(a*a + b*b)
c # display result
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load R
cd $PBS_O_WORKDIR

# --vanilla:
# --no-save: do not save datasets at the end of an R session
R --vanilla --no-save < myjob.in
OR:
#!/bin/sh -l
# FILENAME: myjob.sub

module load R

# --vanilla:
# --no-save: do not save datasets at the end of an R session
R --vanilla --no-save << EOF

# Compute a Pythagorean triple.
a = 3
b = 4
c = sqrt(a*a + b*b)
c # display result
EOF
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.

R version 2.9.0 (2009-04-17)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> # FILENAME: myjob.in
>
> # Compute a Pythagorean triple.
> a = 3
> b = 4
> c = sqrt(a*a + b*b)
> c # display result
[1] 5
>
Any output written to standard error will appear in myjob.sub.emyjobid.
To install additional R packages, create a folder in your home directory called Rlibs. You will need to be running a recent version of R (2.14.0 or greater as of this writing):
$ mkdir ~/Rlibs
If you are running the bash shell (the default on our clusters), add the following line to your ~/.bashrc. Create the file if it does not already exist; you may also need to run "ln -s .bashrc .bash_profile" in your home directory if .bash_profile does not exist either:
export R_LIBS=~/Rlibs:$R_LIBS
If you are running csh or tcsh, add the following to your .cshrc:
setenv R_LIBS ~/Rlibs:$R_LIBS
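If you script this setup, it is worth adding the line only once so that repeated runs do not stack duplicate entries in the startup file. A Python sketch for the bash case follows. The function name ensure_line is hypothetical, and the demonstration writes to a temporary file rather than to your real ~/.bashrc.

```python
import os
import tempfile

R_LIBS_LINE = "export R_LIBS=~/Rlibs:$R_LIBS"


def ensure_line(path, line):
    """Append `line` to the file at `path` unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    content = ""
    if os.path.exists(path):
        with open(path) as handle:
            content = handle.read()
    if line in content.splitlines():
        return False
    with open(path, "a") as handle:
        # Keep the new line on its own row even if the file
        # did not end with a newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True


# Demonstration on a scratch file standing in for ~/.bashrc.
demo_dir = tempfile.mkdtemp()
demo_bashrc = os.path.join(demo_dir, "bashrc")
print(ensure_line(demo_bashrc, R_LIBS_LINE))  # first call adds the line
print(ensure_line(demo_bashrc, R_LIBS_LINE))  # second call changes nothing
```

To apply it for real, you would pass os.path.expanduser("~/.bashrc") as the path.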
Now run "source ~/.bashrc" (or "source ~/.cshrc" if you use csh or tcsh) and start R:
$ module load R/2.14.0
$ R
> .libPaths()
[1] "/home/myusername/Rlibs"
[2] "/apps/rhel5/R-2.14.0/lib64/R/library"
If R_LIBS is set up correctly, .libPaths() should print something similar to the listing above. Next, try installing a package:
> install.packages('packagename',"~/Rlibs","http://streaming.stat.iastate.edu/CRAN")
The above command should download and install the requested R package, which upon completion can then be loaded.
> library('packagename')
If your R package relies on a library that is only installed as a module (this example uses GDAL), load that module before starting R and pass the library locations to the package's configure step:
$ module load gdal
$ module load R
$ R
> install.packages('rgdal',"~/Rlibs","http://streaming.stat.iastate.edu/CRAN", configure.args="--with-gdal-include=$GDAL_HOME/include --with-gdal-lib=$GDAL_HOME/lib")
Repeat install.packages(...) for any packages that you need. Your R packages should now be installed.
For more information about R:
SAS (pronounced "sass") is an integrated system supporting statistical analysis, report generation, business planning, and forecasting. This section illustrates how to submit a small SAS job to a PBS queue. This SAS example displays a small dataset.
Prepare a SAS input file with an appropriate filename, here named myjob.sas:
* FILENAME: myjob.sas

/* Display a small dataset. */
TITLE 'Display a Small Dataset';
DATA grades;
INPUT name $ midterm final;
DATALINES;
Anne 61 64
Bob 71 71
Carla 86 80
David 79 77
Edwardo 73 73
Fannie 81 81
;
PROC PRINT data=grades;
RUN;
Prepare a job submission file with an appropriate filename, here named myjob.sub:
#!/bin/sh -l
# FILENAME: myjob.sub

module load sas
cd $PBS_O_WORKDIR

# -stdio: run SAS in batch mode:
#     read SAS input from stdin
#     write SAS output to stdout
#     write SAS log to stderr
# -nonews: do not display SAS news
# SAS runs in batch mode when the name of the SAS command file
# appears as a command-line argument.
sas -stdio -nonews myjob
Submit the job:
$ qsub -l nodes=1 myjob.sub
View job status:
$ qstat -u myusername
View results in the file for all standard output, myjob.sub.omyjobid:
The SAS System 10:59 Wednesday, January 5, 2011 1
Obs name midterm final
1 Anne 61 64
2 Bob 71 71
3 Carla 86 80
4 David 79 77
5 Edwardo 73 73
6 Fannie 81 81
View the SAS log in the standard error file, myjob.sub.emyjobid:
1 The SAS System 12:32 Saturday, January 29, 2011
NOTE: Copyright (c) 2002-2008 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.2 (TS2M0)
Licensed to PURDUE UNIVERSITY - T&R, Site 70063312.
NOTE: This session is executing on the Linux 2.6.18-194.17.1.el5rcac2 (LINUX) platform.
NOTE: SAS initialization used:
real time 0.70 seconds
cpu time 0.03 seconds
1 * FILENAME: myjob.sas
2
3 /* Display a small dataset. */
4 TITLE 'Display a Small Dataset';
5 DATA grades;
6 INPUT name $ midterm final;
7 DATALINES;
NOTE: The data set WORK.GRADES has 6 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time 0.18 seconds
cpu time 0.01 seconds
14 ;
15 PROC PRINT data=grades;
16 RUN;
NOTE: There were 6 observations read from the data set WORK.GRADES.
NOTE: The PROCEDURE PRINT printed page 1.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.32 seconds
cpu time 0.04 seconds
NOTE: SAS Institute Inc., SAS Campus Drive, Cary, NC USA 27513-2414
NOTE: The SAS System used:
real time 1.28 seconds
cpu time 0.08 seconds
For more information about SAS:
HTCondor allows you to run jobs on systems which would otherwise be idle for however long their primary users do not need those systems. HTCondor is one of several distributed computing systems which ITaP makes available. Most ITaP research resources, in addition to being available through normal means, are a part of BoilerGrid and are accessible via HTCondor. If a primary user needs a processor core on a compute node, HTCondor immediately checkpoints or migrates all HTCondor jobs on that compute node and makes that resource available to the primary user. Thus, shorter jobs will have a better completion rate via HTCondor than longer jobs. Even though HTCondor may have to restart jobs elsewhere, BoilerGrid can offer a vast amount of computational resources to users. Nearly all ITaP research systems are part of BoilerGrid, as are large numbers of lab machines at the West Lafayette and other Purdue campuses. BoilerGrid is one of the largest HTCondor pools in the world. Some machines at other institutions are also a part of a larger HTCondor federation known as DiaGrid and are available as well.
For more information:
Radon Frequently Asked Questions (FAQ)
There are currently no FAQs for Radon.