Compiling Source Code
Documentation on compiling source code on Gautschi.
Compiling GPU Programs
The Gautschi cluster nodes contain 6 GPUs that support CUDA and OpenCL. See the detailed hardware overview for the specifics on the GPUs in Gautschi. This section focuses on using CUDA.
A simple CUDA program has a basic workflow:
- Initialize an array on the host (CPU).
- Copy the array from host memory to GPU memory.
- Apply an operation to the array on the GPU.
- Copy the array from GPU memory back to host memory.
The sample CUDA program used below is named gpu_hello.cu.
Both front-ends and GPU-enabled compute nodes have the CUDA tools and libraries available to compile CUDA programs. To compile a CUDA program, load CUDA, and use nvcc to compile the program:
$ module load gcc/11.4.1 cuda/12.6.0
$ nvcc gpu_hello.cu -o gpu_hello
$ ./gpu_hello
No GPU specified, using first GPUhello, world
This example only illustrates how to copy an array between a CPU and its GPU; it does not perform any serious computation.
The following program times three square matrix multiplications on a CPU and on the global and shared memory of a GPU:
$ module load cuda
$ nvcc mm.cu -o mm
$ ./mm 0
                                                            speedup
                                                            -------
Elapsed time in CPU:                  6555.2 milliseconds
Elapsed time in GPU (global memory):    32.9 milliseconds     199.1
Elapsed time in GPU (shared memory):     3.0 milliseconds    2191.8
For best performance, the input array or matrix must be sufficiently large to overcome the overhead in copying the input and output data to and from the GPU.
Compiling Hybrid Programs
A hybrid program combines MPI with shared-memory parallelism (OpenMP) to take advantage of compute clusters with multi-core compute nodes. Libraries for OpenMPI and Intel MPI (IMPI), along with compilers that support OpenMP for C, C++, and Fortran, are available.
Language | Header Files |
---|---|
Fortran 77 | |
Fortran 90 | |
Fortran 95 | |
C | |
C++ | |
A few examples illustrate hybrid programs with task parallelism of OpenMP:
This example illustrates a hybrid program with loop-level (data) parallelism of OpenMP:
To see the available MPI libraries:
$ module avail impi
$ module avail openmpi
The following tables illustrate how to compile your hybrid (MPI/OpenMP) program. Any compiler flags accepted by the Intel ifort/icc compilers are also accepted by their respective MPI compilers.
Language | Command |
---|---|
Fortran 77 | |
Fortran 90 | |
Fortran 95 | |
C | |
C++ | |
Language | Command |
---|---|
Fortran 77 | |
Fortran 90 | |
Fortran 95 | |
C | |
C++ | |
The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Compiling Serial Programs
A serial program is a single process that executes as a sequential stream of instructions on one processor core. Compilers for serial programs are available for C, C++, and several versions of Fortran.
Here are a few sample serial programs:
- serial_hello.f
- serial_hello.f90
- serial_hello.f95
- serial_hello.c
To load a compiler, enter one of the following:
$ module load intel
$ module load gcc
Language | Intel Compiler | GNU Compiler |
---|---|---|
Fortran 77 | | |
Fortran 90 | | |
Fortran 95 | | |
C | | |
C++ | | |
The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Compiling MPI Programs
OpenMPI and Intel MPI (IMPI) are implementations of the Message-Passing Interface (MPI) standard. Libraries for these MPI implementations and compilers for C, C++, and Fortran are available on all clusters.
Language | Header Files |
---|---|
Fortran 77 | |
Fortran 90 | |
Fortran 95 | |
C | |
C++ | |
Here are a few sample programs using MPI:
To see the available MPI libraries:
$ module avail openmpi
$ module avail impi
Language | Intel MPI | OpenMPI |
---|---|---|
Fortran 77 | | |
Fortran 90 | | |
Fortran 95 | | |
C | | |
C++ | | |
The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".
Compiling OpenMP Programs
All compilers installed on Gautschi include OpenMP functionality for C, C++, and Fortran. An OpenMP program is a single process that takes advantage of a multi-core processor and its shared memory to achieve a form of parallel computing called multithreading. It distributes the work of a process over processor cores in a single compute node without the need for MPI communications.
Language | Header Files |
---|---|
Fortran 77 | |
Fortran 90 | |
Fortran 95 | |
C | |
C++ | |
Sample programs illustrate task parallelism of OpenMP:
A sample program illustrates loop-level (data) parallelism of OpenMP:
To load a compiler, enter one of the following:
$ module load intel
$ module load gcc
Language | Intel Compiler | GNU Compiler |
---|---|---|
Fortran 77 | | |
Fortran 90 | | |
Fortran 95 | | |
C | | |
C++ | | |
The Intel and GNU compilers will not output anything for a successful compilation. Also, the Intel compiler does not recognize the suffix ".f95".