Current REU Projects
In each research project, students will work closely with a member of our staff. The projects will be in a wide variety of areas, including data analytics, high performance computing, DevOps, AI, and containerization.
Project #1: Streamlined Software: Automating user-requested software deployment on Anvil using agile technologies
Simplifying Software Deployment: Empowering users to install complex scientific applications on Anvil with the click of a button! Scientific applications often come with dozens or even hundreds of dependencies with complex build instructions. For example, large packages such as TensorFlow or PyTorch have hundreds of dependencies and are non-trivial to build from source. The intent of this project is to build a CI/CD pipeline on Anvil to facilitate scientific software deployment using the Spack package manager. The REU student will work on creating web-based tools to automate Spack configuration, software deployment, and testing. As part of this project, we will explore and extend the E4S framework developed by the Exascale Computing Project (ECP). Imagine the possibilities: creating a system that not only eases the burden on staff but also unleashes the full potential of user-driven software deployment.
Requirements: A basic understanding of, or familiarity with, Linux command-line tools and/or experience with shell scripting is required.
Past experience with Python and compiling open-source software is a plus, but not strictly necessary if candidates can demonstrate successful completion of relevant trainings prior to the start of the REU program.
Project #2: Unlocking the Impact of Data: The Power of the Dashboard
Empowering Users: Breaking Down Barriers to Data with an Intuitive Dashboard! Imagine effortlessly tapping into powerful metrics that help you understand how you are using your computational resources, or where you can wring out better performance, without any coding or command-line confusion. In this project, you will break down barriers to data, making complex data more accessible and transforming the way we handle user resource allocation and code performance information on Anvil! Interested in data analysis, creating user-friendly dashboards, and empowering users to make sound data-driven decisions? Then this is the project for you.
Requirements: Basic understanding of Linux systems, commands, and/or file systems; basic understanding of, or familiarity with, at least one programming language (e.g., Python, C++, Ruby); and basic understanding of, familiarity with, or experience in data presentation and interpretation.
Past experience with the following skill is a plus, but not strictly necessary: interactive application/widget development.
Students must be able to complete the following trainings prior to the start of the REU program (trainings are all accessible through the RCAC website):
Project #3: Gromacs Gateway: Creating User-friendly Molecular Simulations Online
Dive into the world of molecular dynamics with us as we tackle the challenges of creating easy-to-use simulation solutions. Gromacs, the powerhouse for molecular simulations, often feels like a maze with its reliance on Linux commands and its lack of built-in structure visualization. This project aims to integrate Gromacs seamlessly with the Anvil Open OnDemand (OOD) portal. No more grappling with Linux commands or hunting for external tools. This project's goal is to start the process of creating a one-stop shop for Anvil users' Gromacs needs. Perfect for students looking to make waves in computational chemistry through HPC!
Requirements: Familiarity with basic Linux commands and environments, and a basic understanding of, or ability to read, programming languages like Bash or Ruby is required.
Past experience with the following skills is a plus, but not strictly necessary if the candidate can demonstrate successful completion of a relevant tutorial prior to the start of the REU program:
Project #4: AI-Powered Operational Data Analytics: Enhancing User Experience on Anvil
Picture this: A shared HPC cluster where everyone's racing for computing resources. Your job's in the queue, but the wait time feels like forever. Frustrating, right? This project dives into the world of operational data analytics, using AI and machine learning to create a smart model that predicts job queuing times on Anvil. Our mission? Predicting how long a user's job will be in the queue! This project is not just crunching numbers but turning messy data into a replicable and explainable model for all users. Once the model is created, the project will focus on building a command-line tool and integrating it into Anvil's job submission functions so users can automatically see wait time predictions from the model upon job submission, allowing them to plan for when their job will kick off.
Requirements: Candidates must be familiar with foundational data analysis methods and frameworks, including statistical analysis, data cleaning, and manipulating data using standard libraries like Pandas.
Past experience with machine learning is a plus, but not strictly necessary if the candidate can demonstrate successful completion of a relevant class like Coursera’s Applied Machine Learning in Python course prior to the start of the REU program.
Past REU Projects
Projects focused on a wide variety of areas, including data analytics, high performance computing, DevOps, and containerization.
Project #1: Integrate the XALT job-level usage activity monitoring tool into XDMoD reporting for deeper analysis of workloads.
Project #2: Implement direct cloud burst from Anvil to Azure for HPC and accelerator workloads based on the work with Microsoft in 2022.
Project #3: Connect the Anvil composable subsystem’s Rancher management platform to Azure Kubernetes Service to support elastic scaling of workloads for science gateway applications.
Project #4: Develop deployment solutions for the Jupyter notebook interactive computing platform on Anvil’s composable subsystem for education and training activities.
Projects focused on improving data collection and reporting of the Anvil cluster to ensure quality of service, effective system utilization, and performance.
While benchmarking compute and storage performance, students:
- Learned concepts surrounding benchmarking of HPC systems, including HPL, HPCG, STREAM, and IO500
- Measured performance of Anvil's compute nodes, GPU nodes, scratch, project, and BeeOND filesystems
- Established baselines for system performance for a continuous measurement framework
To improve data collection and reporting, students:
- Configured multiple modules on Anvil's XDMoD instance and improved data collection and reporting of system metrics:
  - PCP, SUPREMM
  - Open OnDemand data collection
  - Continuous measurement framework via Application Kernels
For system environmental measurements, students:
- Created monitoring of Anvil's power consumption at the node and rack levels using Prometheus and Grafana
Projects focused on enhancing Anvil's cybersecurity posture, using an NSF CICI-funded intrusion detection system to monitor aspects of Anvil's network and enable visualization-driven insights into network traffic and cybersecurity alerts. Students helped with:
- Visualization of network traffic trends on Anvil and Purdue networks
- Dashboard development based on Zeek IDS protocol logs (TCP, UDP, SSH, HTTP)
- Port scanning
- Per-host traffic visualization