Parsl+CWL: Towards Combining the Python and CWL Ecosystems

Nishchay Karle, Ben Clifford, Yadu Babuji, Ryan Chard, Daniel S. Katz, Kyle Chard

TL;DR

This paper presents Parsl+CWL, an integration that enables Python-based Parsl workflows to import and execute CWL CommandLineTool definitions, leveraging CWL's portable tool descriptions within Parsl's scalable execution model. It introduces CWLApp wrappers to convert CWL tools into Parsl apps and a parsl-cwl runner for direct execution of CommandLineTools, along with a prototype InlinePythonRequirement to embed Python expressions in CWL workflows. The authors demonstrate a three-step image-processing pipeline implemented in both CWL and Parsl, and show that Parsl-CWL achieves competitive performance relative to existing CWL runners. A key contribution is the exploration of Python-based expressions in CWL, which enhances workflow expressiveness, validation, and dynamic behavior, potentially accelerating adoption of CWL tools in the Python-centric scientific computing ecosystem.

Abstract

The Common Workflow Language (CWL) is a widely adopted language for defining and sharing computational workflows. It is designed to be independent of the engine on which workflows are executed. In this paper, we describe our experiences integrating CWL with Parsl, a Python-based parallel programming library designed to manage execution of workflows across diverse computing environments. We propose a new method that converts CWL CommandLineTool definitions into Parsl apps, enabling Parsl scripts to easily import and use tools represented in CWL. We describe a Parsl runner that is capable of executing a CWL CommandLineTool directly. We also describe a proof-of-concept extension to support inline Python in a CWL workflow definition, enabling seamless use within Parsl's Python ecosystem. We demonstrate the benefits of this integration by presenting example CWL CommandLineTool definitions and showing how they can be used in Parsl, and by comparing the performance of an image processing workflow executed with the Parsl integration and with other CWL runners.
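To make the idea of converting a CWL CommandLineTool definition into something a Python workflow can call more concrete, here is a minimal sketch. It is not the paper's CWLApp implementation: the function `build_command`, the simplified dictionary form of the tool, and the `convert`-based resize tool are all assumptions made for illustration. The sketch handles only `baseCommand`, `inputBinding.position`, `inputBinding.prefix`, and `stdout`, which is a small subset of what a real CommandLineTool can express.

```python
import shlex


def build_command(tool: dict, inputs: dict) -> str:
    """Render a shell command from a simplified CommandLineTool dict.

    Hypothetical helper for illustration only; it ignores most of CWL
    (requirements, arguments, types, expressions, output collection).
    """
    base = tool["baseCommand"]
    argv = [base] if isinstance(base, str) else list(base)

    # Keep only inputs that are bound to the command line and were supplied.
    bound = []
    for name, spec in tool.get("inputs", {}).items():
        binding = spec.get("inputBinding")
        if binding is None or name not in inputs:
            continue
        bound.append((binding.get("position", 0), name, binding))

    # Order by position, then by name to make ties deterministic.
    bound.sort(key=lambda item: (item[0], item[1]))

    for _, name, binding in bound:
        prefix = binding.get("prefix")
        if prefix:
            argv.append(prefix)
        argv.append(str(inputs[name]))

    # Quote each argument so the result is safe to hand to a shell.
    command = " ".join(shlex.quote(arg) for arg in argv)
    if "stdout" in tool:
        command += " > " + shlex.quote(tool["stdout"])
    return command


# Hypothetical tool description, written as the dict a YAML loader might produce.
resize_tool = {
    "class": "CommandLineTool",
    "baseCommand": ["convert"],
    "inputs": {
        "image": {"type": "File", "inputBinding": {"position": 1}},
        "size": {"type": "string",
                 "inputBinding": {"position": 2, "prefix": "-resize"}},
        "output": {"type": "string", "inputBinding": {"position": 3}},
    },
}

command = build_command(
    resize_tool, {"image": "in.png", "size": "50%", "output": "out.png"}
)
print(command)
```

In Parsl, a function decorated with `bash_app` returns a shell command string that Parsl then executes, so a string built this way is the kind of value such a wrapper could return. How the actual integration maps inputs, outputs, and files is described in the paper itself and may differ from this sketch.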

Paper Structure

This paper contains 18 sections, 2 figures.

Figures (2)

  • Figure 1: Runtimes for the CWL image processing workflow using CWLTool, Toil, and Parsl-CWL on three nodes and on one node
  • Figure 2: Runtime for CWL InlineJavaScript processing using CWLTool and Toil, and for InlinePython processing using Parsl-CWL, as the number of words increases from 2 to 1024