January 8, 2013
In January 2011, the National Science Foundation began requiring all grant proposals to include two-page plans that describe what data will be generated in the research and how the data will be managed and shared. Other funding agencies such as the National Institutes of Health, NASA and the National Endowment for the Humanities have their own data management requirements.
The Purdue University Research Repository (PURR) was created to help researchers meet these requirements by providing a platform for collaborating on research and for publishing and archiving datasets. PURR is currently available for broad use by Purdue faculty, staff and graduate students.
Examples of research data include software source code, output from sensors and instruments, interview transcripts, observation logs, spreadsheets, databases, scientific images and video, and more.
Purdue faculty, graduate students and staff can create projects on the PURR website, invite others to join their projects and receive a free allocation of storage, along with tools to help them collaborate on and manage their research data.
“Scholars often publish their findings in conference and journal papers, but without the supporting data, the research can’t be reproduced and verified by others,” says Courtney Matthews, digital data repository specialist at the Purdue Libraries. “PURR gives Purdue researchers a platform for managing and publishing their datasets in a way that meets funder requirements and enables the reuse of data with credit given to the researcher.”
PURR also provides boilerplate text that can be pasted into grant proposals, as well as tutorials and support for developing effective data management plans.
Since its launch, PURR has been included in more than 500 grant proposals originating from Purdue.
Datasets published and archived in PURR are assigned Digital Object Identifiers (DOIs) that uniquely identify them and make them more easily tracked and cited. David Gleich, an assistant professor of Computer Science, recently used PURR to publish a dataset for testing algorithms in social network analysis. “DOIs make it easy to track citations, usage and other metrics,” says Gleich. “It’s always important to be able to demonstrate research impact.”
For eight years, agronomy Professor Jeffrey Volenec and colleagues collected data from 96 farm plots to better understand how potassium and phosphorus levels influence the growth of alfalfa. With the study over, the question became what to do with all that data.
That concern prompted Volenec to be one of the first users of PURR. “It’s unlikely to be done anytime soon by anyone else so we thought this type of data ought to be preserved,” Volenec says. “It was bought mainly with tax dollars. The data, the numbers, belong to the people.”
Datasets are archived for a minimum of 10 years, after which time they are managed as a collection of the University’s libraries. PURR was designed to implement open standards and best practices, such as the ISO 16363 certification of trustworthy digital repositories, for which an audit process is currently under way.
The Purdue Libraries, the Office of the Vice President for Research and ITaP jointly developed PURR. The service is based on HUBzero, which was also developed at Purdue.