Purdue’s Anvil supercomputer accelerating scientific research progress
One year into its operations, Purdue’s powerful Anvil supercomputer has already enabled groundbreaking research in fields such as drug discovery, astrophysics and sustainability.
Anvil, which is funded by a $22 million grant from the National Science Foundation, officially went into production on February 1 of this year after beginning early user operations in November 2021. Anvil hit 100% utilization in August and is already supporting more than 2,700 users, many of them students, from 143 institutions and 42 domains of science.
“The team that put Anvil together has realized what researchers are going to be doing over the next few years and what the need is now for high-performance computing,” says Richard Wilton, an associate research scientist at Johns Hopkins University who used Anvil to analyze whole-genome DNA sequencing data from over 600 patients with certain psychiatric disorders.
Wilton and his collaborators used a variety of bioinformatics tools on Anvil, including high-performance computing software that exploited Anvil’s GPU infrastructure to obtain twice the processing speed that they had obtained in comparable work on another national supercomputer.
With a peak processing speed of 5.1 petaFLOPs, Anvil is one of the fastest campus supercomputers in the US and debuted at number 143 on the Top500 list of the world’s most powerful supercomputers.
Asif ud-Doula, an associate professor of physics at Penn State Scranton, develops numerical tools to model the stellar winds of massive stars and uses three-dimensional magnetohydrodynamics (MHD), a very computationally intensive modeling method. Generating a single 3D MHD model requires more than 60,000 core hours, estimates ud-Doula. That’s the equivalent of running a personal computer for several years, but he achieved it using hundreds of Anvil cores in less than two weeks.
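The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a rough illustration: it assumes the work divides perfectly across cores, and the core counts (1, 4 and 256) are hypothetical choices, not figures reported by ud-Doula.

```python
# Back-of-the-envelope wall-clock estimate for a 60,000 core-hour job.
# Assumes ideal (linear) scaling; the core counts below are illustrative only.

HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365


def wall_clock_hours(core_hours: float, cores: int) -> float:
    """Elapsed hours if the work divides evenly across `cores`."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return core_hours / cores


core_hours = 60_000  # estimate quoted in the article

# Hypothetical machines: a single core, a 4-core PC, and 256 cluster cores.
for label, cores in [("1 core", 1), ("4 cores", 4), ("256 cores", 256)]:
    hours = wall_clock_hours(core_hours, cores)
    print(f"{label}: {hours / HOURS_PER_DAY:.1f} days "
          f"({hours / HOURS_PER_YEAR:.2f} years)")
```

Under these assumptions a single core would need close to seven years, while 256 cores would finish in under ten days, which is consistent with the "several years" versus "less than two weeks" comparison above.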
Anvil consists of 1,000 nodes with two 64-core third-generation AMD EPYC processors each, and will deliver over 1 billion CPU core hours to the Advanced Cyberinfrastructure Coordination Ecosystem: Services and Support (ACCESS) each year. Anvil's nodes are interconnected with 100 Gbps Mellanox HDR InfiniBand. The supercomputer ecosystem also includes 32 large memory nodes, each with 1 TB of RAM, and 16 nodes each with four NVIDIA A100 Tensor Core GPUs providing 1.5 PF of single-precision performance to support machine learning and artificial intelligence applications.
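As a rough sanity check on the core-hour figure, the node counts above can be multiplied out. The sketch below assumes every core is available for all 8,760 hours of a non-leap year, so it gives a theoretical ceiling rather than the amount actually allocated; the function name is a hypothetical helper.

```python
# Theoretical upper bound on CPU core hours per year from the stated hardware.
# Assumes 100% availability over a 365-day year (no maintenance or downtime).

HOURS_PER_YEAR = 24 * 365  # 8,760


def annual_core_hours(nodes: int, sockets_per_node: int,
                      cores_per_socket: int,
                      hours_per_year: int = HOURS_PER_YEAR) -> int:
    """Total core hours if every core runs for every hour of the year."""
    return nodes * sockets_per_node * cores_per_socket * hours_per_year


total_cores = 1_000 * 2 * 64           # 1,000 nodes, two 64-core CPUs each
ceiling = annual_core_hours(1_000, 2, 64)

print(f"CPU cores: {total_cores:,}")
print(f"Core hours per year (ceiling): {ceiling:,}")
```

The product is 128,000 cores and about 1.12 billion core hours per year, in line with the "over 1 billion" figure.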
“Because of Anvil’s state-of-the-art GPUs, we’ve gained four times better performance compared to our previous benchmarks,” says Daipayan Sarkar, a postdoc working with Josh Vermaas, an assistant professor at Michigan State University’s DOE Plant Research Laboratory. Sarkar and Vermaas use molecular dynamics simulations to study the movement of small molecules, such as how carbon dioxide crosses a plant plasma membrane during photosynthesis.
Yinglong Miao, an assistant professor at the Center for Computational Biology and Department of Molecular Biosciences at the University of Kansas, was an early user who relied on Anvil to accelerate molecular dynamics simulations in a study of G protein activation by the beta1-adrenergic receptor, a key membrane protein that has served as a drug target for treating heart failure.
“Using CPUs this work could take months, if not years,” says Miao. “With the GPUs on Anvil, we can run these simulations much faster. Instead of needing months, we just need a couple of weeks.”
In addition to moving science forward, Anvil is also helping to train the next generation of researchers.
Min Zhang, a Purdue professor of statistics, used Anvil to train cancer researchers in her 2022 Big Data Training for Cancer Research (“Big Care”) workshop, the latest in a series of biomedical big data workshops she’s organized.
Anvil’s speed and processing power meant that this year the organizers were able to invite more participants than originally planned.
“There’s no way we could do this for so many people without Anvil,” says Zhang.
Anvil was also used by students in The Data Mine, a data science learning community at Purdue for students from all majors, led by Mark Daniel Ward, professor of statistics. The Anvil team onboarded 1,300 Data Mine students shortly before the transition from the NSF’s Extreme Science and Engineering Discovery Environment (XSEDE) to ACCESS. The students were able to quickly get up to speed by running Jupyter Notebook through Open OnDemand, the interactive access interface.