
Data Science and Machine Learning Training

RCAC regularly offers training on a variety of data science and machine learning topics. These sessions take an applied view of data science to help researchers use these techniques, and the clusters, effectively in their work.

Users who are just getting started on the clusters are encouraged to check out our general computing training resources as well.

Foundational Data Science and Machine Learning Topics

  • Data 101 for Machine Learning introduces how data is used for machine learning, covering data collection, pre-processing, and exploratory data analysis.
  • Time Series Forecasting 101 and 201 explore Machine Learning and Deep Learning techniques to analyze and forecast time series data in high-performance computing environments.
  • Natural Language Processing 101 provides an introduction to core concepts of modern NLP to help users get started with incorporating NLP into their work.
  • Data Visualization Techniques and Tools introduces essential concepts and hands-on experience in data visualization.
  • Purdue is also pleased to partner with the Nvidia Deep Learning Institute to regularly offer free full-day Fundamentals of Deep Learning workshops.
  • Generative AI Series covers topics in generative AI, including prompt engineering, architecture, and custom models.
  • Introduction to R covers core aspects of R programming, including its syntax, data types, and basic data manipulation and visualization techniques.

For additional training on technical concepts, we encourage users to explore the trainings offered by Purdue Libraries.

Large-Scale Data Science and Related Topics

  • Jupyter Kernels and HPC is an intermediate-level discussion of how the Jupyter system (JupyterHub, Jupyter Notebook, etc.) functions at the application level on a distributed computing cluster.
  • Python Package Installation covers how to install scientific Python applications on RCAC clusters.
  • Research Storage Overview covers RCAC storage resources and services, typical use cases, access methods, and data management strategies and workflows.
  • [Coming Soon!] Accelerated Machine Learning with GPUs