Purdue team develops powerful new machine learning technique using Gilbreth community cluster

  • February 5, 2021
  • Science Highlights

A Purdue team has used the Gilbreth community cluster, operated by ITaP Research Computing, to develop a new algorithm that harnesses multiple GPU nodes to speed up the training of machine learning models.

“I can submit 50 to 100 jobs at the same time and because Gilbreth is such a powerful cluster, I only have to wait overnight before I have the results,” says Chih-hao Fang, the first author on the paper and a PhD student at Purdue working under the supervision of Ananth Grama, Samuel D. Conte Professor of Computer Science.

Empirical results show that their algorithm trains models significantly faster than other state-of-the-art distributed optimization methods.

Before beginning this project, Fang was not experienced with using Purdue’s GPU clusters, so he worked with Amiya Maji, senior computational scientist for ITaP Research Computing, to get started. Maji “patiently taught me from scratch,” Fang says, helping him install required software, submit jobs to the cluster and monitor those jobs.

Fang and his colleagues presented their work in the “GPU Algorithms and Optimizations” track of the SC20 supercomputing conference, which was held virtually in November 2020.

In addition to Fang and Grama, co-authors on the paper include Sudhir Kylasa, a postdoctoral research associate at Purdue and former student of Grama’s; Fred Roosta, a faculty member in the school of mathematics and physics at the University of Queensland; and Michael Mahoney, associate adjunct professor of statistics at the University of California-Berkeley.

To learn more about Purdue’s Community Cluster Program, contact Preston Smith, executive director of ITaP Research Computing, psmith@purdue.edu or 49-49729.

Originally posted: February 5, 2021 3:56pm EST