Exploring the Potential of Carbon-Aware Execution for Scientific Workflows

Kathleen West, Fabian Lehmann, Vasilis Bountris, Ulf Leser, Yehia Elkhatib, Lauritz Thamsen

TL;DR

The paper investigates carbon-aware execution for scientific workflows by evaluating two real Nextflow workflows (Chip-Seq and Rangeland) against both average and marginal carbon intensity (CI) signals. It establishes a baseline energy footprint and then assesses three carbon-aware strategies: entire workflow shifting, interrupted shifting, and resource scaling, including compute-resource selection and processor frequency adjustments. Key findings show substantial emission reductions for Chip-Seq, up to about 64–66% depending on the CI signal and window, with interrupted shifting offering robust gains at modest overheads; Rangeland also benefits but to a lesser extent. The work demonstrates practical carbon-reduction potential for delay-tolerant, interruptible, and scalable workflows and contributes replication-ready methods and data for broader adoption in HPC settings.

Abstract

Scientific workflows are widely used to automate scientific data analysis and often involve processing large quantities of data on compute clusters. As such, their execution tends to be long-running and resource intensive, leading to significant energy consumption and carbon emissions. Meanwhile, a wealth of carbon-aware computing methods have been proposed, yet little work has focused specifically on scientific workflows, even though they present a substantial opportunity for carbon-aware computing because they are inherently delay tolerant, efficiently interruptible, and highly scalable. In this study, we demonstrate the potential for carbon-aware workflow execution. For this, we estimate the carbon footprint of two real-world Nextflow workflows executed on cluster infrastructure. We use a linear power model for energy consumption estimates and real-world average and marginal carbon intensity (CI) data for two regions. We evaluate the impact of carbon-aware temporal shifting, pausing and resuming, and resource scaling. Our findings highlight significant potential for reducing emissions of workflows and workflow tasks.
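
To make the estimation approach concrete, the following is a minimal sketch of how a linear power model and an hourly CI trace might be combined to estimate a footprint and to pick the lowest-emission start hour when shifting an entire workflow. It is not the authors' implementation: the function names, the hourly granularity, the idle and maximum power values, and the utilization and CI numbers are all assumptions chosen for illustration.

```python
from typing import Sequence


def linear_power_watts(utilization: float, p_idle: float, p_max: float) -> float:
    """Linear power model: interpolate between idle and maximum power.

    `utilization` is CPU utilization in [0, 1]; `p_idle` and `p_max` are
    hypothetical node power figures in watts.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be in [0, 1]")
    return p_idle + (p_max - p_idle) * utilization


def footprint_g(utilization_per_hour: Sequence[float],
                ci_g_per_kwh: Sequence[float],
                start: int,
                p_idle: float,
                p_max: float) -> float:
    """Estimated emissions (gCO2eq) of a run that starts at hour index `start`.

    Each entry of `utilization_per_hour` covers one hour, so the energy in
    kWh for that hour equals the modeled power in kW.
    """
    total = 0.0
    for offset, utilization in enumerate(utilization_per_hour):
        energy_kwh = linear_power_watts(utilization, p_idle, p_max) / 1000.0
        total += energy_kwh * ci_g_per_kwh[start + offset]
    return total


def best_start(utilization_per_hour: Sequence[float],
               ci_g_per_kwh: Sequence[float],
               p_idle: float,
               p_max: float) -> int:
    """Start hour within the CI window that minimizes estimated emissions."""
    last_start = len(ci_g_per_kwh) - len(utilization_per_hour)
    if last_start < 0:
        raise ValueError("CI window is shorter than the workflow")
    return min(
        range(last_start + 1),
        key=lambda s: footprint_g(utilization_per_hour, ci_g_per_kwh, s, p_idle, p_max),
    )


# Illustrative (made-up) inputs: a two-hour workflow and a four-hour CI window.
utilization = [0.5, 1.0]
ci = [400.0, 300.0, 100.0, 200.0]
baseline = footprint_g(utilization, ci, 0, p_idle=100.0, p_max=300.0)
shifted_start = best_start(utilization, ci, p_idle=100.0, p_max=300.0)
shifted = footprint_g(utilization, ci, shifted_start, p_idle=100.0, p_max=300.0)
```

Interrupted shifting, which the study also evaluates, would relax the requirement that the run occupy consecutive hours; this sketch covers only the uninterrupted case.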

Paper Structure

This paper contains 11 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Reduction in carbon footprint of workflows using interrupted temporal shifting over 12–192 hour windows.
  • Figure 2: Impact of resource scaling on the alignment of trimgalore execution with marginal CI in California (a) and the Netherlands (b).