Free, fast BLAST processing now available on Purdue’s DiaGrid system

November 29, 2011

BLAST, the popular bioinformatics software, is now available to Purdue faculty and their students on Purdue’s DiaGrid distributed computing system, which can make thousands of processors available at once for BLAST jobs — at no cost to users.

ITaP also has developed an online application to provide ready access to BLAST on DiaGrid using Purdue’s HUBzero platform, which brings computational research software and access to high-performance and cloud computing as close as the Web browser.

“DiaGrid is akin to offering me a different, and potentially better, hammer to do my work,” says Rick Westerman, bioinformatics specialist at the Purdue Genomics Facility, who uses BLAST to test hundreds to hundreds of thousands of potential protein sequences against a protein database.

Interested faculty, research staff and graduate students are invited to learn about the new availability of BLAST on DiaGrid during a luncheon from 11:30 a.m. to 1 p.m. Friday, Dec. 9, in Stewart Center, Room 322.

Registration — and lunch — is free. Unregistered participants are welcome, but organizers are asking people to register so enough lunches can be ordered.

The luncheon will include presentations from current BLAST users testing the DiaGrid system.

BLAST on DiaGrid offers:

  • Free cycles on nearly 50,000 processors, which can radically speed up job processing times. (In a typical month, users tap 2 million CPU hours.)
  • A familiar Web interface using Purdue’s HUBzero platform for scientific collaboration.
  • A no-cost, no-hassle signup process.

BLAST (it stands for Basic Local Alignment Search Tool) is widely used for studying biological sequence information, such as DNA sequences or the amino acid chains that make up protein molecules.

“DiaGrid is particularly suited for research using BLAST because it involves numerous serial computations that can be parceled out to any number of available processors,” says Carol Song of the Rosen Center for Advanced Computing, ITaP’s research computing unit. “Generally, the more processors the better, and DiaGrid can make thousands of processors available at a time.”
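To illustrate what "parceling out" can look like, here is a minimal Python sketch that splits a set of query sequences into independent batches, each of which could in principle be searched on a separate processor. It is an illustration only, not ITaP's implementation: the function names `parse_fasta` and `split_into_batches`, the sample sequences and the round-robin strategy are all assumptions made for this example.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header without the leading '>', sequence)


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_into_batches(records: List[Record], n_batches: int) -> List[List[Record]]:
    """Deal records round-robin into at most n_batches non-empty batches."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    batches = [records[i::n_batches] for i in range(n_batches)]
    return [batch for batch in batches if batch]


if __name__ == "__main__":
    # Made-up sequences, used only to show the shape of the data.
    example = ">seq1\nMKTAYIAK\nQRQISFVK\n>seq2\nMADEEKLP\n>seq3\nMSTNPKPQ\n"
    for number, batch in enumerate(split_into_batches(list(parse_fasta(example)), 2)):
        print(number, [header for header, _ in batch])
```

Because each query is searched against the database independently of the others, the batches do not need to communicate with one another, which is why adding processors tends to help.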

While BLAST on DiaGrid is ready for use now, ITaP is seeking feedback from Purdue researchers to refine the Web hub and make future versions even easier to use, Song says.

DiaGrid taps computers that are idle at the moment in offices, student computer labs, cluster supercomputers and more. It is based on the Condor distributed computing system, which works by pooling computers over the Purdue campus network and off campus via the Internet and fast research networks. Whenever computers in the pool aren’t in use (at night, when their owners are at lunch, and so on), the system sends them work. When a computer’s owner returns, active jobs are automatically shifted to idle machines in the pool.
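The scheduling idea described above can be sketched as a toy model in Python. This is not Condor's API and makes no claim about how Condor handles eviction internally: the `Pool` class, its method names and the requeue-at-the-front policy are assumptions chosen to keep the example short.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Pool:
    """Toy pool that tracks which job, if any, runs on each machine."""

    def __init__(self, machines: List[str]) -> None:
        self.running: Dict[str, Optional[str]] = {m: None for m in machines}
        self.owner_present: Dict[str, bool] = {m: False for m in machines}
        self.queue: Deque[str] = deque()

    def submit(self, job: str) -> None:
        """Queue a job and try to place it right away."""
        self.queue.append(job)
        self.schedule()

    def schedule(self) -> None:
        """Hand queued jobs to machines that are idle and unattended."""
        for machine, job in self.running.items():
            if not self.queue:
                break
            if job is None and not self.owner_present[machine]:
                self.running[machine] = self.queue.popleft()

    def owner_returns(self, machine: str) -> None:
        """Evict the job on this machine, if any, and try to place it elsewhere."""
        self.owner_present[machine] = True
        job = self.running[machine]
        if job is not None:
            self.running[machine] = None
            self.queue.appendleft(job)  # evicted work goes to the front
        self.schedule()

    def owner_leaves(self, machine: str) -> None:
        """Mark the machine as available again and give it work if any waits."""
        self.owner_present[machine] = False
        self.schedule()
```

In this model, calling `owner_returns("office-pc")` while a job is running there moves that job to another idle machine if one exists. Otherwise the job waits in the queue until a machine frees up.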

Among other things, DiaGrid has been used to study the molecular structure of viruses; probe the Solar System’s formation; project the reliability of Indiana’s electrical supply; model the spread of water pollutants; and identify millions of potential zeolites, common catalysts in chemical reactions.

Writer: Greg Kline, science and technology writer, Information Technology at Purdue, 765-494-8167 (office), 765-426-8545 (cell), gkline@purdue.edu
