Workflow Automation Tools for Many-Task Computing
Historically, high-performance computing (HPC) has primarily involved tightly coupled simulations executed synchronously across many nodes. More recently, other paradigms have become common in research computing where the workload is not a single large, coupled task but rather very many modest tasks with little or no dependency on one another. These might be data processing and analysis tasks, machine-learning experiments, bioinformatics tasks, or parameter searches within a calculation.
In these many-task, high-throughput computing (HTC) workloads, researchers often attempt some form of parallelization within their application language (e.g., Python, R, or MATLAB®). This approach is tedious, fraught with difficulty for most users, and inflexible, and it distracts from the research concerns of the project. It is better to use a distinct workflow automation tool to manage the many individual tasks.
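At its simplest, the task-oriented approach means writing the many independent commands into a plain task file and handing that file to a runner. The sketch below assumes a hypothetical analysis script, process.py, and input files under data/; GNU parallel appears only as a familiar stand-in for the runners discussed later.

    # Build a task file with one independent shell command per line.
    # (process.py and the data/results layout are hypothetical placeholders.)
    mkdir -p results
    for f in data/*.dat; do
        echo "python process.py $f > results/$(basename "$f" .dat).out"
    done > tasks.txt

    # Run up to 8 tasks at a time on the current node; GNU parallel is
    # just one of several runners that accept a command-per-line file.
    parallel --jobs 8 < tasks.txt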
This workshop outlines this paradigm on a traditional SLURM cluster and surveys several solutions. Different tools and techniques are discussed in escalating order of sophistication, covering their features and pitfalls. Finally, a demonstration of the HyperShell utility showcases the level of sophistication possible, both for individuals and for research groups, in managing task execution. (hypershell.readthedocs.io)
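To give a flavor of that demonstration, HyperShell can consume the same kind of command-per-line task file. The invocation below is only a rough sketch of its cluster mode; the exact subcommands and options for parallelism, output handling, and multi-node execution are assumptions here and should be taken from hypershell.readthedocs.io.

    # Sketch only: feed a task file to HyperShell for managed execution.
    # Consult the HyperShell documentation for the current options that
    # control parallelism, failure handling, and scaling across nodes.
    hyper-shell cluster tasks.txt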
Schedule:
The Events page will show the next scheduled session.
Prerequisites:
- Basic Linux/Unix command-line knowledge is required (Unix 101 and Unix 201).
- Clusters 101 is strongly recommended but not required.
Topics:
- Intro to concepts in high-throughput computing.
- How the scheduler helps and where it falls short (see the job-array sketch after this list).
- Overview of various workflow automation tools.
- Overview of the HyperShell utility.
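For the scheduler topic above, the built-in starting point is a SLURM job array, which turns each line of a task file into its own scheduled job element. The script below is a minimal sketch with hypothetical file names; it shows where the scheduler helps (fanning out many tasks from one directive) while hinting at where it can fall short for very large or very short task sets.

    #!/bin/bash
    #SBATCH --job-name=many-tasks
    #SBATCH --array=0-99          # one array element per task
    #SBATCH --ntasks=1
    #SBATCH --time=00:30:00

    # Each array element executes its own line of the task file
    # (tasks.txt holds one independent command per line, as above).
    TASK=$(sed -n "$((SLURM_ARRAY_TASK_ID + 1))p" tasks.txt)
    eval "$TASK"

Submitted with sbatch, this runs 100 independent job elements, one per line of tasks.txt.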