Compiling GPU Programs on AMD GPUs

The Bell cluster nodes contain 2 AMD GPUs that support ROCm, HIP, and OpenCL, and can run CUDA code once it has been converted to HIP. See the detailed hardware overview for the specifics on the GPUs in Bell. This section focuses on using HIP and CUDA with the ROCm ecosystem of drivers, libraries, and compiler tools, including conversion tools that transform existing CUDA code so that it runs on both Nvidia and AMD GPU hardware.

A simple HIP program has a basic workflow:

  • Initialize an array on the host (CPU).
  • Copy the array from host memory to GPU memory.
  • Apply an operation to the array on the GPU.
  • Copy the array from GPU memory back to host memory.

Here is a sample HIP program:

Both front-ends and GPU-enabled compute nodes have the ROCm HIP tools and libraries needed to compile HIP programs. To compile a HIP program, load the rocm module and use hipcc:

$ module load rocm
$ hipcc gpu_hello.cpp -o gpu_hello
$ ./gpu_hello
No GPU specified, using first GPUhello, world

The example above only illustrates how to copy an array between the CPU and the GPU; it does not perform any substantial computation.

The following example illustrates converting existing CUDA code to the HIP programming model so that it can be compiled and executed on AMD GPUs. The program times square matrix multiplication three ways: on a CPU, on a GPU using global memory, and on a GPU using shared memory:

$ module load rocm
# Convert CUDA to HIP
$ hipify-perl --inplace mm.cu

# Compile with HIP compiler and run!
$ hipcc mm.cu -o mm
$ ./mm 0
                                                            speedup
                                                            -------
Elapsed time in CPU:                    7900.3 milliseconds
Elapsed time in GPU (global memory):      13.9 milliseconds  568.7
Elapsed time in GPU (shared memory):       6.4 milliseconds  1230.8

For best performance, the input array or matrix must be sufficiently large to overcome the overhead in copying the input and output data to and from the GPU.
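
A rough cost model, offered as a sketch rather than a measurement of Bell's hardware, shows why size matters for an n x n matrix multiplication. All symbols are assumptions introduced here: t_0 is a fixed per-transfer latency, w the bytes per element, B the host-to-GPU bandwidth, and F the GPU's sustained arithmetic rate.

```latex
\[
S(n) = \frac{T_{\mathrm{CPU}}(n)}{T_{\mathrm{copy}}(n) + T_{\mathrm{kernel}}(n)},
\qquad
T_{\mathrm{copy}}(n) \approx t_0 + \frac{3 n^2 w}{B},
\qquad
T_{\mathrm{kernel}}(n) \approx \frac{2 n^3}{F}
\]
```

The copy term covers two input matrices and one output matrix, so it grows as n^2, while the arithmetic grows as n^3. For small n the fixed latency and copy time dominate the denominator and limit the speedup S(n); as n grows, the kernel term dominates and the copy overhead becomes a shrinking fraction of the total.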

For more information about AMD, ROCm, HIP, and GPUs:
