Purdue Bandwidth Challenge team: Go ahead, try this at home
December 2, 2008
Purdue’s Bandwidth Challenge entry at the SuperComputing ’08 conference is more like a NASCAR stock car than an Indy 500 racer.
Bandwidth Challenge entries, like Indy cars, tend to be specialized designs you would never see on the street or in your average data center. A team from the Rosen Center for Advanced Computing at Purdue instead built its system on a standard framework, albeit with some judicious tweaking. But while Purdue’s entry is based on open-source networking software and commodity hardware, the Purdue team thinks it still has plenty under the hood.
The Bandwidth Challenge is a standard feature of the premier international conference for high performance computing, networking, storage and analysis, taking place Nov. 15-21 in Austin, Texas. The competition pushes the limits on moving data over a long-distance computer network like the Internet or the TeraGrid, the National Science Foundation-funded network that is the world’s largest for open science computing and in which Purdue is a partner and resource provider.
The idea is that insights and innovations flowing from the nation’s scientists and their supercomputers are fed by torrents of data moving across continents on networks like the TeraGrid, the faster and the more consistently the better.
Moreover, the lessons learned could prove valuable in Purdue’s high performance computing operations, and for anyone else who wants to make use of them.
“What we have is open source, available to anybody right now if they wanted it,” said Ramon Williamson, the Rosen Center’s senior storage engineer, who’s serving as team manager. “We’re saying use your free stuff and tweak it and you can do this, too.”
The Rosen Center is the research and discovery arm of Information Technology at Purdue (ITaP), the University’s central information technology organization.
“Our Challenge entries are all about pushing the technology to meet the demands of the scientific community,” said John Campbell, associate vice president for information technology, who heads the Rosen Center. Even the theme of Purdue’s SC08 booth—No cycle left behind, no byte left unexplored—is a nod to the creative ways ITaP is finding to improve scientific productivity, Campbell said, like the project being highlighted in the Bandwidth Challenge.
Participants in the challenge are judged on the peak amount of data they can move from one place to another and how high a sustained rate they can maintain. Indiana University and partners won last year with a peak of 18.21 gigabits per second (Gbps), in a pipeline with a maximum of 20 Gbps, and a sustained rate of 16.2 Gbps. At 1 Gbps, a decent-sized novel transfers in about a hundredth of a second.
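The arithmetic behind those figures can be sketched in a few lines of Python. This is an illustration only: the article does not say how large a "decent-sized novel" is, so the 1 MB figure below is an assumption, as is the 1 TB data set used for comparison.

```python
# Worked arithmetic for the transfer rates quoted above.
# Assumption (not from the article): a "decent-sized novel" is about 1 MB of plain text.

NOVEL_BYTES = 1_000_000              # hypothetical size of a novel as plain text
TERABYTE_BYTES = 1_000_000_000_000   # illustrative research data set, also an assumption


def transfer_seconds(size_bytes: float, rate_gbps: float) -> float:
    """Seconds needed to move size_bytes over a link sustaining rate_gbps (10**9 bits/s)."""
    bits = size_bytes * 8
    return bits / (rate_gbps * 1e9)


if __name__ == "__main__":
    print(f"Novel at 1 Gbps: {transfer_seconds(NOVEL_BYTES, 1.0):.3f} s")
    for label, rate in [("16.2 Gbps sustained", 16.2), ("18.21 Gbps peak", 18.21)]:
        minutes = transfer_seconds(TERABYTE_BYTES, rate) / 60
        print(f"1 TB at {label}: {minutes:.1f} min")
```

Under the 1 MB assumption, the novel works out to 0.008 seconds at 1 Gbps, which is consistent with "about a hundredth of a second."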
This year, however, the competition is also placing a premium on “real-world applications and data movement issues” in addition to merely filling the pipeline.
That got Williamson thinking about the BlueArc servers the Rosen Center recently began using to manage data traffic. The cost-effective “appliances” run field-programmable gate arrays (FPGAs), in essence computer chips that can be reprogrammed for a specific purpose rather than being hard-wired generalists. In BlueArc’s case, that job is storage traffic cop. The tailored machines offer a considerable speed advantage, Williamson said.
The trick was to translate that into an advantage over a long-distance network using standard protocols like NFS (Network File System) and TCP (Transmission Control Protocol). That task has been the purview of Williamson’s colleague Michael Shuey, high performance computing technical architect at the Rosen Center.
“We see a lot of researchers needing to move a large amount of data across the country,” Shuey said. “We thought maybe we could tune NFS to do this.”
One reason they thought that: they couldn’t find anybody who had tried since the 1990s, when networks were considerably slower and network technology less advanced.
Shuey said writing transmitted data to storage in a manner that keeps the flow clipping along wasn’t much of a problem, but reading data from storage efficiently has been a challenge. The protocols read ahead to try to keep the pipe full, but they are designed for moving material from local storage in the computer itself, such as a hard disk, or from nearby within a data center. They don’t account for the time lag in a cross-country transmission. Shuey has had to find a way around that, basically by getting the system to stock up on more data in the read-ahead process than it normally would.
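Why a longer time lag calls for more read-ahead can be shown with the standard bandwidth-delay product: to keep a link busy, the amount of data requested but not yet received has to be at least the link rate multiplied by the round-trip time. The sketch below is not the Purdue team's tuning. The 10 Gbps link, the 30-millisecond cross-country round trip, the 0.1-millisecond data-center round trip and the 32 KB read size are all assumed values chosen for illustration.

```python
import math

# All numbers below are illustrative assumptions, not figures from the article.
LINK_GBPS = 10.0            # hypothetical link rate
WAN_RTT_MS = 30.0           # hypothetical cross-country round-trip time
LAN_RTT_MS = 0.1            # hypothetical round-trip time inside a data center
READ_SIZE_BYTES = 32 * 1024 # hypothetical size of one read request


def bdp_bytes(rate_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to keep the link full."""
    bytes_per_second = rate_gbps * 1e9 / 8
    return bytes_per_second * (rtt_ms / 1000)


def reads_in_flight(rate_gbps: float, rtt_ms: float, read_size_bytes: int) -> int:
    """Number of outstanding read requests needed to cover the bandwidth-delay product."""
    return math.ceil(bdp_bytes(rate_gbps, rtt_ms) / read_size_bytes)


def max_rate_gbps(window_bytes: float, rtt_ms: float) -> float:
    """Best-case throughput when at most window_bytes can be outstanding per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e9


if __name__ == "__main__":
    for name, rtt in [("data center", LAN_RTT_MS), ("cross-country", WAN_RTT_MS)]:
        needed = reads_in_flight(LINK_GBPS, rtt, READ_SIZE_BYTES)
        print(f"{name}: {bdp_bytes(LINK_GBPS, rtt):,.0f} bytes in flight, "
              f"about {needed} reads outstanding")
    capped = max_rate_gbps(1024 * 1024, WAN_RTT_MS)
    print(f"1 MiB of read-ahead over the long link caps at {capped:.2f} Gbps")
```

With these assumed numbers, a read-ahead depth that is ample inside a data center covers only a small fraction of what the long-distance link can carry, which is the gap the extra read-ahead is meant to close.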
Greg Veldman, storage administrator for the Rosen Center, and Patrick Finnegan, UNIX system administrator, are the other members of the team. The team is partnering with BlueArc; Texas Memory Systems, a maker of fast solid-state storage units akin to a USB flash drive; and Foundry Networks, which is providing the high-speed network connections between the storage appliances and the SC08 network in Austin for the competition.
James Reaney, BlueArc’s director for research markets, said the company likes working with partners who can apply its technology to real-world problems.
“I’m looking for research customers who have both the technical expertise and the willingness to push the boundaries of storage solutions,” he said, “and that’s exactly why we partnered with Purdue for their entry into the Bandwidth Challenge event.”
If successful, the demonstration may show that researchers can work effectively with their large data collections not only close to home, “but perhaps they can access, manipulate and manage all that research data from just about anywhere,” Reaney said.
The plan is to move data—for climate simulations or something analogous—from the competition storage system in Purdue’s SC08 booth in Austin to West Lafayette for processing on the Rosen Center’s Steele cluster. Meanwhile, what the Rosen Center is learning from the exercise, besides being of value to other high performance computing centers, should prove useful at home, Williamson said.
A cluster to be built at Purdue Calumet from parts of West Lafayette’s decommissioned Lear supercomputer, which was displaced by Steele, is likely to access storage on the West Lafayette campus and could employ something like the system developed by the Bandwidth Challenge team.
In addition, the Rosen Center, pressed for space in its machine room in the Mathematical Sciences Building, might at some point create a central, remote storage facility in the West Lafayette area or elsewhere. Using these techniques, such a facility could function as if it were on site.