Purdue team offers big data science boot camp for biomedical researchers using ITaP Research Computing resources
An interdisciplinary team of Purdue faculty is hosting its third annual boot camp to teach researchers in the biomedical sciences the fundamentals of big data science using Purdue’s Rice research supercomputer.
“There’s this huge population of people who have the data in their own lab, but they don’t know what to do with it,” says Min Zhang, a professor of statistics and the principal investigator on a National Institutes of Health grant that supports the project.
With co-PIs James Fleet, distinguished professor of nutrition science, and Wanqing Liu, adjunct professor of medicinal chemistry and molecular pharmacology, as well as Pete Pascuzzi, assistant professor in the Purdue University Libraries, Zhang organized a boot camp-style course on Purdue’s campus in 2016 and 2017. It covered topics such as using public databases and tools, data visualization and next-generation sequencing techniques to address research questions in biomedical science. Another boot camp will be held this summer.
The boot camp is aimed at faculty, postdoctoral researchers and advanced graduate students. Zhang and her team were funded to meet the needs of researchers in the Midwest, but they’ve had students from all corners of the nation.
The organizers know that they can’t possibly teach everything there is to know about computational biology in a two-week mini-course. However, they designed their course to bring down many of the barriers that limit the use of big data approaches by biomedical researchers. By doing this, Zhang and her colleagues hope to open up the field to researchers who have more traditional biochemistry or biomedical backgrounds.
Feedback from past participants shows they’re succeeding in their mission.
“We took people who, as a general rule, felt they had limited ability to use the computer and do statistics related to this kind of science,” says Fleet. “And by six months out, at least half the people had made an active effort to start working on new training and were applying for grants that included some of the skills that we’d taught them.”
Having access to dedicated nodes on Rice allowed instructors to use the cluster interactively, so boot camp participants could wrestle with their data in hands-on exercises, in addition to attending lectures and watching demonstrations. ITaP Research Computing staff members, including senior scientific applications analysts Gladys Andino and Steve Kelley, helped with classroom instruction.
“We’re very appreciative of how ITaP has helped us do this, and we couldn’t do it without them,” says Fleet.
While supercomputing power isn’t necessary for everything the boot camps cover, the organizers are pleased that – thanks to the ITaP collaboration – participants are exposed to the potential ways high-performance computing may benefit their research. They’re hopeful that participants will return to their own campuses and seek out computational resources available to them there, while also passing on what they’ve learned to others.