Big data research focus of free workshop Feb. 2-3

  • February 2, 2021 11:00am - February 3, 2021 5:00pm EST
  • Online

Purdue will host a free two-day big data workshop focusing on topics such as Hadoop and Spark on Tuesday, Feb. 2, and Wednesday, Feb. 3, for faculty, staff and students who want to learn more about tools for working with large data sets. While similar workshops have been held in the Envision Center in the past, this workshop will be delivered remotely via Zoom because of COVID-19.

The workshop is free, but interested participants must register by noon on Friday, Jan. 29. The event is sponsored by ITaP and the National Science Foundation's Extreme Science and Engineering Discovery Environment (XSEDE), as well as the Pittsburgh Supercomputing Center.

The second day’s subject matter will build on the first day, so participants should attend both sessions. Participants should use a computer with a terminal client such as PuTTY or MobaXterm installed.

Participants register through XSEDE, in which Purdue is a partner. A free XSEDE account can be created on the XSEDE user portal. Once you have an account, you can register for the online session. More information is available here.

The mini course covers the basics of using Hadoop and Spark in a Linux environment, as well as machine learning with Spark and deep learning with TensorFlow. There are no prerequisites, but familiarity with Python will be helpful.

The supercomputers that ITaP Research Computing makes available to Purdue researchers through the Community Cluster Program support software that allows a user to deploy high-performance Hadoop clusters on demand.

This big data workshop is part of a series of high-performance computing training sessions ITaP sponsors at Purdue through XSEDE, says Eric Adams, who coordinates training for ITaP Research Computing.

Originally posted: January 28, 2021 9:38am EST