Pegasus on Anvil

Overview

Pegasus is a scientific workflow management system that helps users define, manage, and execute complex computational workflows on high-performance computing (HPC) systems. A workflow in Pegasus consists of multiple tasks with defined dependencies, and Pegasus handles job submission, data staging, execution ordering, and failure recovery.
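To make the idea of execution ordering concrete, the following minimal sketch uses only the Python standard library (not Pegasus itself) to group the tasks of a small dependency graph into "waves" that could run in parallel. The task names and the `execution_waves` helper are hypothetical and chosen purely for illustration; Pegasus performs this kind of ordering internally, along with data staging and failure recovery.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical four-task workflow: each key maps to the tasks it depends on.
dependencies = {
    "preprocess": set(),
    "analyze_a": {"preprocess"},
    "analyze_b": {"preprocess"},
    "merge": {"analyze_a", "analyze_b"},
}


def execution_waves(deps):
    """Group tasks into waves; every task in a wave has all dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark them finished so dependents become ready
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Here `analyze_a` and `analyze_b` land in the same wave because both depend only on `preprocess`, while `merge` must wait for both of them.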

On Purdue Anvil, Pegasus enables users to develop and run workflows that execute on Anvil’s compute resources through the SLURM batch scheduler. Pegasus is provided to Anvil users via a preconfigured Jupyter Notebook environment, allowing interactive workflow development, testing, and submission without requiring a separate local installation.

Pegasus Deployment

Pegasus is deployed on Anvil through the Anvil Notebook Service, which provides browser-based access to Jupyter Notebooks running on Anvil infrastructure. The Pegasus Notebook environment includes the Pegasus workflow management system, HTCondor for workflow execution management, and preconfigured integration with Anvil’s SLURM scheduler.

In this environment, users can develop and debug workflows interactively with the Pegasus Python API or command-line tools, submit them to Anvil’s batch system under their allocations, and monitor execution and logs directly from the notebook interface. Users do not need to install or configure Pegasus themselves.
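As a rough illustration of the Pegasus Python API, the sketch below builds a two-job workflow object. It assumes the Pegasus 5.x `Pegasus.api` module is importable, as it should be inside the Pegasus Notebook; the transformation names (`count`, `summarize`), the file names, and the `build_workflow` helper are hypothetical. The import is guarded so the snippet also loads on machines without Pegasus. It does not show the site, transformation, or replica catalogs that a workflow needs before it can be planned and submitted; the Anvil-specific settings for those are not covered here.

```python
try:
    from Pegasus.api import File, Job, Workflow
    HAVE_PEGASUS = True
except ImportError:  # e.g. running outside the Pegasus Notebook environment
    HAVE_PEGASUS = False


def build_workflow():
    """Build a two-job workflow; requires the Pegasus Python API."""
    # Logical file names (hypothetical); Pegasus maps them to physical files.
    raw = File("input.txt")
    counts = File("counts.txt")
    summary = File("summary.txt")

    # Each job names a transformation (an executable registered elsewhere)
    # and declares the files it reads and writes.
    count = Job("count").add_args(raw, counts)
    count.add_inputs(raw).add_outputs(counts)

    summarize = Job("summarize").add_args(counts, summary)
    summarize.add_inputs(counts).add_outputs(summary)

    wf = Workflow("example")
    wf.add_jobs(count, summarize)
    return wf


if HAVE_PEGASUS:
    wf = build_workflow()
    print("Created", type(wf).__name__)
else:
    print("Pegasus Python API not available in this environment")
```

Because `summarize` declares `counts.txt` as an input and `count` declares it as an output, Pegasus can infer that `summarize` depends on `count` when the workflow is written out, so the dependency does not have to be stated explicitly.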

Accessing Pegasus

To access Pegasus on Anvil:

  1. Log in to the Anvil Notebook Service using your ACCESS credentials.
  2. Select the Pegasus Notebook from the list of available notebook environments.
  3. Choose an Anvil allocation to associate with your notebook session.

Once the notebook is launched, Pegasus is ready to use within the notebook environment.

Example Workflows

Pegasus example workflows suitable for HPC environments are available in the Pegasus example repository. This repository contains Jupyter Notebook–based examples that demonstrate how to define, configure, and execute Pegasus workflows on batch-scheduled systems.

The examples cover common workflow patterns and Pegasus features, including workflow construction, task dependencies, data management, and job submission and monitoring. These notebooks can be used as starting points for developing and testing Pegasus workflows within the Anvil Notebook environment.
