Compiling Source Code on Bell

Compiling Serial Programs

A serial program is a single process which executes as a sequential stream of instructions on one processor core. Compilers for serial programs are available for C, C++, and several versions of Fortran.

To load a compiler, enter one of the following:

$ module load intel
$ module load gcc

The following table illustrates how to compile your serial program:

Language     Intel Compiler                        GNU Compiler
Fortran 77   $ ifort myprogram.f -o myprogram      $ gfortran myprogram.f -o myprogram
Fortran 90   $ ifort myprogram.f90 -o myprogram    $ gfortran myprogram.f90 -o myprogram
Fortran 95   $ ifort myprogram.f90 -o myprogram    $ gfortran myprogram.f95 -o myprogram
C            $ icc myprogram.c -o myprogram        $ gcc myprogram.c -o myprogram
C++          $ icpc myprogram.cpp -o myprogram     $ g++ myprogram.cpp -o myprogram

The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".

Compiling MPI Programs

OpenMPI and Intel MPI (IMPI) are implementations of the Message-Passing Interface (MPI) standard. Libraries for these MPI implementations and compilers for C, C++, and Fortran are available on all clusters.

MPI programs require including a header file:
Language     Header Files
Fortran 77   INCLUDE 'mpif.h'
Fortran 90   INCLUDE 'mpif.h'
Fortran 95   INCLUDE 'mpif.h'
C            #include <mpi.h>
C++          #include <mpi.h>

To see the available MPI libraries:

$ module avail openmpi 
$ module avail impi

The following table illustrates how to compile your MPI program. Any compiler flags accepted by Intel ifort/icc compilers are compatible with their respective MPI compiler.

Language     Intel MPI                            OpenMPI or Intel MPI (IMPI)
Fortran 77   $ mpiifort program.f -o program      $ mpif77 program.f -o program
Fortran 90   $ mpiifort program.f90 -o program    $ mpif90 program.f90 -o program
Fortran 95   $ mpiifort program.f90 -o program    $ mpif90 program.f95 -o program
C            $ mpiicc program.c -o program        $ mpicc program.c -o program
C++          $ mpiicpc program.C -o program       $ mpiCC program.C -o program

The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".

Compiling OpenMP Programs

All compilers installed on Bell include OpenMP functionality for C, C++, and Fortran. An OpenMP program is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. It distributes the work of a process over processor cores in a single compute node without the need for MPI communications.

OpenMP programs require including a header file:
Language     Header Files
Fortran 77   INCLUDE 'omp_lib.h'
Fortran 90   use omp_lib
Fortran 95   use omp_lib
C            #include <omp.h>
C++          #include <omp.h>

To load a compiler, enter one of the following:

$ module load intel
$ module load gcc

The following table illustrates how to compile your shared-memory program. Any compiler flags accepted by ifort/icc compilers are compatible with OpenMP. Current Intel compilers enable OpenMP with -qopenmp; the older -openmp spelling is no longer accepted.

Language     Intel Compiler                                 GNU Compiler
Fortran 77   $ ifort -qopenmp myprogram.f -o myprogram      $ gfortran -fopenmp myprogram.f -o myprogram
Fortran 90   $ ifort -qopenmp myprogram.f90 -o myprogram    $ gfortran -fopenmp myprogram.f90 -o myprogram
Fortran 95   $ ifort -qopenmp myprogram.f90 -o myprogram    $ gfortran -fopenmp myprogram.f95 -o myprogram
C            $ icc -qopenmp myprogram.c -o myprogram        $ gcc -fopenmp myprogram.c -o myprogram
C++          $ icpc -qopenmp myprogram.cpp -o myprogram     $ g++ -fopenmp myprogram.cpp -o myprogram

The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".

Compiling Hybrid Programs

A hybrid program combines MPI and shared-memory (OpenMP) parallelism to take advantage of compute clusters with multi-core compute nodes. Libraries for OpenMPI and Intel MPI (IMPI), and compilers that include OpenMP for C, C++, and Fortran, are available.

Hybrid programs require including header files:
Language     Header Files
Fortran 77   INCLUDE 'omp_lib.h'
             INCLUDE 'mpif.h'
Fortran 90   use omp_lib
             INCLUDE 'mpif.h'
Fortran 95   use omp_lib
             INCLUDE 'mpif.h'
C            #include <mpi.h>
             #include <omp.h>
C++          #include <mpi.h>
             #include <omp.h>

To see the available MPI libraries:

$ module avail impi
$ module avail openmpi

The following tables illustrate how to compile your hybrid (MPI/OpenMP) program. Any compiler flags accepted by Intel ifort/icc compilers are compatible with their respective MPI compiler.

Intel MPI
Language     Command
Fortran 77   $ mpiifort -qopenmp myprogram.f -o myprogram
Fortran 90   $ mpiifort -qopenmp myprogram.f90 -o myprogram
Fortran 95   $ mpiifort -qopenmp myprogram.f90 -o myprogram
C            $ mpiicc -qopenmp myprogram.c -o myprogram
C++          $ mpiicpc -qopenmp myprogram.C -o myprogram

OpenMPI or Intel MPI (IMPI) with Intel Compiler
Language     Command
Fortran 77   $ mpif77 -qopenmp myprogram.f -o myprogram
Fortran 90   $ mpif90 -qopenmp myprogram.f90 -o myprogram
Fortran 95   $ mpif90 -qopenmp myprogram.f90 -o myprogram
C            $ mpicc -qopenmp myprogram.c -o myprogram
C++          $ mpiCC -qopenmp myprogram.C -o myprogram

OpenMPI or Intel MPI (IMPI) with GNU Compiler
Language     Command
Fortran 77   $ mpif77 -fopenmp myprogram.f -o myprogram
Fortran 90   $ mpif90 -fopenmp myprogram.f90 -o myprogram
Fortran 95   $ mpif90 -fopenmp myprogram.f95 -o myprogram
C            $ mpicc -fopenmp myprogram.c -o myprogram
C++          $ mpiCC -fopenmp myprogram.C -o myprogram

The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix .f95.

Intel MKL Library

Intel Math Kernel Library (MKL) contains ScaLAPACK, LAPACK, Sparse Solver, BLAS, Sparse BLAS, CBLAS, GMP, FFTs, DFTs, VSL, VML, and Interval Arithmetic routines. After you load a version of the Intel compiler with module load, MKL resides in the directory stored in the environment variable MKL_HOME.

Loading an Intel compiler with module load also sets several environment variables that help you link applications with MKL. Here are some example combinations of simplified linking options:

$ module load intel
$ echo $LINK_LAPACK
-L${MKL_HOME}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

$ echo $LINK_LAPACK95
-L${MKL_HOME}/lib/intel64 -lmkl_lapack95_lp64 -lmkl_blas95_lp64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread

ITaP recommends that you use the provided variables to define MKL linking options in your compiling procedures. The Intel compiler modules also provide two other environment variables, LINK_LAPACK_STATIC and LINK_LAPACK95_STATIC, which you may use if you need to link MKL statically.

ITaP recommends that you use dynamic linking of libguide. If so, define LD_LIBRARY_PATH such that you are using the correct version of libguide at run time. If you use static linking of libguide, then:

  • If you use the Intel compilers, link in the libguide version that comes with the compiler (use the -qopenmp option).
  • If you do not use the Intel compilers, link in the libguide version that comes with the Intel MKL above.

Provided Compilers

Compilers are available on Bell for Fortran, C, and C++. Compiler sets from Intel and GNU are installed.

Detailed documentation on each compiler set available on Bell follows.

On Bell, ITaP recommends the following set of compilers and libraries for building code:

  • GCC 9.3.0
  • OpenMPI

To load the recommended set:

$ module load rcac
$ module list

GNU Compilers

The official name of the GNU compilers is "GNU Compiler Collection" or "GCC". To discover which versions are available:

$ module avail gcc

Choose an appropriate GCC module and load it. For example:

$ module load gcc

An older version of the GNU compiler will be in your path by default. Do NOT use this version. Instead, load a newer version using the command module load gcc.

Here are some examples for the GNU compilers:
Language     Serial Program                           MPI Program                            OpenMP Program
Fortran 77   $ gfortran myprogram.f -o myprogram      $ mpif77 myprogram.f -o myprogram      $ gfortran -fopenmp myprogram.f -o myprogram
Fortran 90   $ gfortran myprogram.f90 -o myprogram    $ mpif90 myprogram.f90 -o myprogram    $ gfortran -fopenmp myprogram.f90 -o myprogram
Fortran 95   $ gfortran myprogram.f95 -o myprogram    $ mpif90 myprogram.f95 -o myprogram    $ gfortran -fopenmp myprogram.f95 -o myprogram
C            $ gcc myprogram.c -o myprogram           $ mpicc myprogram.c -o myprogram       $ gcc -fopenmp myprogram.c -o myprogram
C++          $ g++ myprogram.cpp -o myprogram         $ mpiCC myprogram.cpp -o myprogram     $ g++ -fopenmp myprogram.cpp -o myprogram

More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module.

Intel Compilers

One or more versions of the Intel compiler are available on Bell. To discover which ones:

$ module avail intel

Choose an appropriate Intel module and load it. For example:

$ module load intel

Here are some examples for the Intel compilers:

Language     Serial Program                        MPI Program                              OpenMP Program
Fortran 77   $ ifort myprogram.f -o myprogram      $ mpiifort myprogram.f -o myprogram      $ ifort -qopenmp myprogram.f -o myprogram
Fortran 90   $ ifort myprogram.f90 -o myprogram    $ mpiifort myprogram.f90 -o myprogram    $ ifort -qopenmp myprogram.f90 -o myprogram
Fortran 95   (same as Fortran 90)                  (same as Fortran 90)                     (same as Fortran 90)
C            $ icc myprogram.c -o myprogram        $ mpiicc myprogram.c -o myprogram        $ icc -qopenmp myprogram.c -o myprogram
C++          $ icpc myprogram.cpp -o myprogram     $ mpiicpc myprogram.cpp -o myprogram     $ icpc -qopenmp myprogram.cpp -o myprogram

More information on compiler options appears in the official man pages, which are accessible with the man command after loading the appropriate compiler module.

Compiling GPU Programs on AMD GPUs

The Bell cluster nodes contain 2 AMD GPUs that support ROCm, HIP, CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Bell. This section focuses on using HIP and CUDA with the ecosystem of ROCm drivers, libraries and compiler tools (including conversion tools that can transform existing CUDA codes to run on both Nvidia and AMD GPU hardware).

A simple HIP program has a basic workflow:

  • Initialize an array on the host (CPU).
  • Copy array from host memory to GPU memory.
  • Apply an operation to array on GPU.
  • Copy array from GPU memory to host memory.

Here is a sample HIP program:

Both front-ends and GPU-enabled compute nodes have the ROCm HIP tools and libraries available to compile HIP programs. To compile a HIP program, load the rocm module and use hipcc:

$ module load rocm
$ hipcc gpu_hello.cpp -o gpu_hello
$ ./gpu_hello
No GPU specified, using first GPUhello, world

The above example illustrates only how to copy an array between a CPU and its GPU but does not perform a serious computation.

The following example illustrates the conversion of an existing CUDA code to the HIP programming model so that it can be compiled and executed on AMD GPUs. The program times a square matrix multiplication three ways: on a CPU, on a GPU using global memory, and on a GPU using shared memory:

$ module load rocm
# Convert CUDA to HIP
$ hipify-perl --inplace mm.cu

# Compile with HIP compiler and run!
$ hipcc mm.cu -o mm
$ ./mm 0
                                                            speedup
                                                            -------
Elapsed time in CPU:                    7900.3 milliseconds
Elapsed time in GPU (global memory):      13.9 milliseconds  568.7
Elapsed time in GPU (shared memory):       6.4 milliseconds  1230.8

For best performance, the input array or matrix must be sufficiently large to overcome the overhead in copying the input and output data to and from the GPU.
