A Systematic Evaluation of the Potential of Carbon-Aware Execution for Scientific Workflows

Kathleen West, Youssef Moawad, Fabian Lehmann, Vasilis Bountris, Ulf Leser, Yehia Elkhatib, Lauritz Thamsen

TL;DR

This paper investigates how carbon-aware execution can reduce the carbon footprint of scientific workflows. It systematically evaluates temporal shifting, interruption-based scheduling, and resource scaling using seven real-world Nextflow workflows across multiple regions and two carbon-intensity (CI) signals, average and marginal. The study finds that, under ideal conditions, temporal shifting can cut operational emissions by over 80%, interruptions can yield 30–70% savings in short windows, and resource scaling can achieve up to 67% reductions, with regional variability. The results highlight the potential of aligning workflow execution with low-carbon energy, while also outlining practical considerations, trade-offs, and directions for making these techniques workable in real-world systems.

Abstract

Scientific workflows are widely used to automate scientific data analysis and often involve computationally intensive processing of large datasets on compute clusters. As such, their execution tends to be long-running and resource-intensive, resulting in substantial energy consumption and, depending on the energy mix, carbon emissions. Meanwhile, a wealth of carbon-aware computing methods have been proposed, yet little work has focused specifically on scientific workflows, even though they present a substantial opportunity for carbon-aware computing because they are often significantly delay tolerant, efficiently interruptible, highly scalable and widely heterogeneous. In this study, we first exemplify the problem of carbon emissions associated with running scientific workflows, and then show the potential for carbon-aware workflow execution. For this, we estimate the carbon footprint of seven real-world Nextflow workflows executed on different cluster infrastructures using both average and marginal carbon intensity data. Furthermore, we systematically evaluate the impact of carbon-aware temporal shifting, and the pausing and resuming of the workflow. Moreover, we apply resource scaling to workflows and workflow tasks. Finally, we report the potential reduction in overall carbon emissions, with temporal shifting capable of decreasing emissions by over 80%, and resource scaling capable of decreasing emissions by 67%.
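To make the idea of temporal shifting concrete, the following is a minimal sketch and not the authors' method. It assumes a workflow that draws constant power, an hourly carbon-intensity forecast in gCO2e/kWh, and a runtime given in whole hours. The function names `emissions_g` and `best_start`, as well as the sample values, are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def emissions_g(power_kw: float, ci_window: Sequence[float]) -> float:
    """Operational emissions in gCO2e for a constant-power job.

    Assumption: each entry of ci_window is the carbon intensity
    (gCO2e/kWh) for one hour of execution.
    """
    # kW * 1 h * gCO2e/kWh = gCO2e, summed over every hour of the run
    return sum(power_kw * ci for ci in ci_window)


def best_start(
    ci: Sequence[float], runtime_h: int, power_kw: float
) -> Tuple[int, float]:
    """Return (start_hour, emissions) of the lowest-emission start time.

    Exhaustively tries every start hour for which the whole run
    fits inside the forecast horizon.
    """
    if runtime_h <= 0 or runtime_h > len(ci):
        raise ValueError("runtime must fit inside the forecast horizon")
    candidates = range(len(ci) - runtime_h + 1)
    return min(
        ((s, emissions_g(power_kw, ci[s:s + runtime_h])) for s in candidates),
        key=lambda pair: pair[1],
    )


# Hypothetical six-hour forecast (gCO2e/kWh), not data from the paper.
forecast = [400, 350, 120, 100, 150, 420]

baseline = emissions_g(2.0, forecast[0:2])   # start immediately
start, shifted = best_start(forecast, runtime_h=2, power_kw=2.0)
reduction = 1 - shifted / baseline           # fraction of emissions avoided
```

With these made-up numbers, starting at hour 2 instead of hour 0 lowers emissions from 1500 to 440 gCO2e. Real workflows have varying power draw and uncertain forecasts, so this sketch only illustrates the shifting principle.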

Paper Structure

This paper contains 59 sections, 14 figures, 9 tables.

Figures (14)

  • Figure 1: A scientific workflow formed of seven tasks.
  • Figure 2: Daily mean average carbon intensity per month in 2024 for the seven regions we studied: Great Britain, Germany, California (USA), Texas (USA), South Africa, Tokyo (Japan), and New South Wales (Australia).
  • Figure 3: A comparison of average and marginal CI signals for Northern Texas between two days, 20th and 27th of January 2023.
  • Figure 4: Reduction using entire workflow shifting in Great Britain.
  • Figure 5: Reduction using entire workflow shifting in South Africa.
  • ...and 9 more figures