Purdue, Globus partnership gives researchers an easy way to move big data to and from campus

September 25, 2014

Purdue doctoral student Archana Tankasala is studying electron interactions in few-electron systems involving 1.5 to 2 million atoms as part of research that could contribute to development of quantum computers exponentially more capable than today’s supercomputers.

To do her research, Tankasala, a member of electrical and computer engineering Professor Gerhard Klimeck’s lab, uses Purdue’s community cluster supercomputers, among the most powerful available to researchers on any single campus in the nation. But that’s not all. She also uses one of the most powerful supercomputers available to researchers nationwide, the Blue Waters supercomputer at the National Center for Supercomputing Applications.

Blue Waters is in Illinois, however, and simulating systems with 1.5 million to 2 million atoms generates a lot of data to move to and from Purdue, 702 gigabytes in one case. Tankasala has found a way to make that job nearly painless by using Globus, which bills itself as something like Dropbox for scientists and offers an easy, fast, secure way to move and share large research data sets.
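To put the 702-gigabyte figure in perspective, the short Python sketch below estimates how long a transfer of that size would take under ideal conditions. The 1 Gbps and 10 Gbps rates are hypothetical examples, not measured speeds for Purdue, Blue Waters or Globus, and the function name is illustrative only. The calculation uses decimal units and ignores protocol overhead, disk speed and network contention, so real transfers would likely take longer.

```python
def transfer_time_seconds(size_gigabytes: float, rate_gbps: float) -> float:
    """Ideal time to move size_gigabytes at a sustained rate of rate_gbps.

    Uses decimal units (1 GB = 10**9 bytes, 1 Gbps = 10**9 bits per second)
    and ignores protocol overhead, disk speed and network contention.
    """
    if rate_gbps <= 0:
        raise ValueError("rate_gbps must be positive")
    bits = size_gigabytes * 8e9          # bytes -> bits
    return bits / (rate_gbps * 1e9)      # bits / (bits per second)


DATASET_GB = 702  # size quoted in the article

# Hypothetical sustained rates, not measured figures for any real link.
for rate in (1, 10):
    minutes = transfer_time_seconds(DATASET_GB, rate) / 60
    print(f"{rate:>2} Gbps: about {minutes:.1f} minutes")
```

Under these assumptions, the same data set that would tie up a 1 Gbps link for roughly an hour and a half moves in under 10 minutes at 10 Gbps, which is why the speed of the underlying research network matters.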

A new agreement between Globus and Purdue now gives Purdue researchers a Purdue-specific Globus website for access. Users can set up a Globus account through the Purdue Globus portal and then sign in with their Purdue user name and password. The service is open to Purdue faculty, staff and students who need to transfer large amounts of research data to and from campus, or share data with collaborators around the country or around the globe.

“The Purdue Globus portal gives us an easy place for Purdue users to start,” says Preston Smith, manager of research support for ITaP Research Computing (RCAC). “They can log in with their Purdue career accounts and off they go.”

Tankasala says she can simply submit a batch of data to Globus and then concentrate on other things. Globus sends her an email when the job is done.

“The user interface is really friendly,” she says. “It didn’t take long to figure out how to proceed with a transfer, and it is really fast.”

At Purdue, the system takes advantage of the University’s upgraded, faster campus research network and its high-speed connection to fast national and international research networks like Internet2.

Purdue’s Globus service also is integrated with the Research Data Depot, a new state-of-the-art research data storage system from ITaP designed, configured and operated for the needs of Purdue researchers.

Work also is planned to integrate the Purdue Globus service with Fortress, the archive data storage system that ITaP Research Computing (RCAC) operates for researchers who need to store data and results long term.

While ITaP has always offered research data storage for computational and archival purposes in connection with the Community Cluster Program supercomputers, the Research Data Depot makes available over 2 petabytes of storage to any Purdue research group or campus unit in need of a high-capacity central solution for storing large, active research data sets at a competitive price.

At a cost of $150 per terabyte annually, a research group can purchase or expand storage space in the Research Data Depot as the need arises. Researchers and campus units can buy capacity in the Research Data Depot through the “Purchase” link atop the ITaP Research Computing (RCAC) website, www.rcac.purdue.edu.
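As a rough budgeting aid, the annual cost can be sketched in a few lines of Python. This assumes the quoted $150 rate applies uniformly to whole terabytes, with no minimum purchase, volume discount or other fees; the article does not state those details, and the function name is hypothetical.

```python
PRICE_PER_TB_PER_YEAR = 150  # US dollars, as quoted in the article


def annual_cost(terabytes: int) -> int:
    """Annual cost in dollars for a whole number of terabytes.

    Assumes simple linear pricing; minimums, discounts and fees
    are not described in the article and are not modeled here.
    """
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return terabytes * PRICE_PER_TB_PER_YEAR


# Example: a research group budgeting for 10 TB of storage.
print(f"10 TB: ${annual_cost(10):,} per year")
```

Under that assumption, a group storing 10 terabytes would pay $1,500 per year.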

Through an easy-to-use Web interface, space is configurable by faculty for use on a project or research group basis and in on-campus and off-campus collaborations. Data is protected from accidental deletion and mirrored at two sites on different ends of campus to ensure rapid access, high reliability and recoverability.

For more information, contact Preston Smith, psmith@purdue.edu or 49-49729.

Originally posted: September 25, 2014, 3:04 p.m.