Department of Biological Sciences
Purdue University
Brief Project Description:
Project: Performance optimization of PFT application
The parallel PFT (Parallel Fourier Transform) based application
is used extensively for the 3D structure reconstruction of virus
particles from cryo-EM (electron microscopy) images by the
structural biologists in the structural biology group at Purdue.
The project was initiated to analyze performance bottlenecks
(both computational and memory) and to address them. The
application is memory intensive, requiring several gigabytes
of memory to store the 3D map of the model and the projections.
For reconstructing a large virus such as the mimivirus, nodes with
16 GB of memory per node are currently being used, and even then only
one processor of the dual-processor node can be used. To utilize
the other processor, which is idle during the reconstruction process,
we decided to use fine-grained loop-level parallelism, realized
using OpenMP threads. This has
resulted in a 75% increase in throughput (comparing the pure MPI
case with the hybrid MPI + OpenMP implementation) using the same number
of MPI ranks (processors) in both cases and 2 OpenMP threads per node
in the latter, hybrid case. Other enhancements have also been made to the
code so that it can use either the FFTW library or the other clean FFT
implementation for performing its Fourier transforms. As part of the
project the application has been ported to different platforms (Linux
clusters (Hamlet, Lear, Macbeth @ Purdue), Big Red (IBM PPC 970), IBM P690,
SGI Altix, Cray-XT3, Cray-X1E, etc.). We are also looking at current
state-of-the-art algorithms for 3D structure reconstruction being used in
other application domains.
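
The loop-level parallelism described above can be illustrated with a
minimal sketch in C. It is not taken from the PFT code: the function
name sum_sq_diff, its arguments, and the computation itself are
hypothetical stand-ins for an inner loop whose iterations are
independent. Because OpenMP threads share the memory of the process
that spawns them, a loop like this can use the second processor
without a second copy of the data.

```c
#include <stddef.h>

/* Hypothetical kernel (an assumption for illustration, not PFT code):
   sum of squared differences between two arrays of n values. */
double sum_sq_diff(const double *a, const double *b, long n)
{
    double total = 0.0;
    long i;

    /* Each iteration is independent, so the iterations can be divided
       among threads. reduction(+:total) gives every thread a private
       partial sum and adds the partial sums together after the loop.
       If OpenMP is not enabled at compile time, the pragma is ignored
       and the loop simply runs serially. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        double d = a[i] - b[i];
        total += d * d;
    }
    return total;
}
```

With GCC, for example, OpenMP is enabled with the -fopenmp flag, and
the thread count can be set at run time through the OMP_NUM_THREADS
environment variable (2 would match the hybrid case described above).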