Research using the Conte cluster, Phi acceleration yields optimal results from graph matching problems faster

December 18, 2014

Arif Khan isn’t just using Purdue’s Conte cluster supercomputer and its Intel Xeon Phi accelerators to develop better ways to find a needle in a haystack. He’s making it easier and much faster to find the best needles in a stack of needles.

Khan is a doctoral student in computer science who works with Professor Alex Pothen, whose lab has partnerships with Intel and Sandia National Laboratories, among others. Khan’s doctoral research focuses on using graph algorithms on high-performance computing systems like Conte, the latest of Purdue’s Community Cluster Program research supercomputers and one of the top 100 such systems in the world.

On Conte, operated for Purdue faculty by ITaP Research Computing, Khan’s techniques have allowed him to solve complex graph problems up to 50 times faster. The research was one of the award-winning papers at SC14, the largest supercomputing conference in the world, held in New Orleans in November 2014.

For information on the Community Cluster Program, email

Graph problems have a lot of applications in generating optimal answers to questions involving large and complex data sets, notably in producing “stable,” or mutually desirable, matches, for instance of children and schools or donor organs and transplant patients. Khan used the example of matching medical residents with U.S. hospitals, in which the goal is to pair the hospitals with the best possible fit among the residents and each resident with the hospital that is the best possible fit for them.

Khan is interested in the use of graph matching algorithms in privacy applications, such as allowing large data sets to be explored without exposing the identity of individuals or exposing sensitive information about them in the process. This is something potentially useful for a range of purposes, from medical studies involving patient data to the way giant retailers like Amazon examine customers’ shopping data. The 2012 Nobel Prize in economics was awarded, in part, for stable matching methods employing graph problems.

Such problems are challenging, even to a supercomputer like Conte, because they can present, perhaps, millions of candidate solutions to sort through in trying to find the best solution. That makes the problems both time and memory intensive.

But not every needle in the stack necessarily has the qualities desired. This is where the promising algorithms Khan is developing come into play. They’re designed to identify and isolate the data within the data that will generate the best results.

“Effectively, you are working with a smaller set of data to make decisions,” Khan says.

In turn, concentrating on a more manageable, more focused subset of data makes it possible to solve more complex problems without taxing the limits of available computational resources.

Using the Conte cluster, Khan has been able to solve problems to very near — 96 percent — their optimal solution and 10 times faster on one of Conte’s 16-processor nodes and up to 50 times faster using the Intel Xeon Phi accelerators in Conte, which offer up to 60 additional processors.

Phi-optimized code can take advantage of hardware support for accelerated matrix computation and dense floating-point math, offering possibilities for considerable performance improvement in mathematically intense research computation, says Michael Shuey, ITaP’s research infrastructure architect.

Khan is working now on algorithms that can efficiently take advantage of the processing power on multiple nodes and their Phi accelerators to make solving even more complex problems feasible.

Originally posted: December 18, 2014  2:56pm