Workflow Mini-Apps: Portable, Scalable, Tunable & Faithful Representations of Scientific Workflows

Ozgur Ozan Kilic, Tianle Wang, Matteo Turilli, Mikhail Titov, Andre Merzky, Line Pouchard, Shantenu Jha

TL;DR

The paper addresses the challenge of designing and evaluating complex scientific workflows across heterogeneous HPC platforms by introducing workflow mini-apps. It presents a methodology to derive emulated tasks via a tunable, portable wfMiniAPI, and to assemble these into full workflow mini-apps using middleware such as RADICAL-Cybertools. Using two real-world workflows (the Inverse Problem and DeepDriveMD), the authors demonstrate fidelity in makespan, I/O, and resource utilization, portability across Polaris, Summit, and Frontier, and improved performance reproducibility at reduced cost. The work advances workflow engineering by providing a publishable API and a configurable framework to study scalability and execution models without deploying full-scale, resource-heavy workflows.

Abstract

Workflows are critical for scientific discovery. However, the sophistication, heterogeneity, and scale of workflows make building, testing, and optimizing them increasingly challenging. Furthermore, their complexity and heterogeneity make performance reproducibility hard. In this paper, we propose workflow mini-apps as a tool to address the challenges in building and testing workflows while controlling the fidelity of representing real-world workflows. Workflow mini-apps are deployed and run on various HPC systems and architectures without workflow-specific constraints. We offer insight into their design and implementation, providing an analysis of their performance and reproducibility. Workflow mini-apps thus advance the science of workflows by providing simple, portable, and managed (fidelity) representations of otherwise complex and difficult-to-control real workflows.

Paper Structure (16 sections, 1 equation, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Design process of a workflow mini-app
  • Figure 2: The two workflows used in the Inverse Problem. Here, the number of phases is taken to be three as an example. The number inside each task is the phase ID the task belongs to. The dashed lines mark the boundaries between stages; tasks in different stages cannot run in parallel. Arrows represent task dependencies: a task does not start until all tasks that point to it have finished. [wang2023parallel]
  • Figure 3: Overview of DeepDriveMD under the synchronous execution model. [brace2022coupling]
  • Figure 4: The CPU and GPU resource utilization (RU) as a function of execution time for the original workflow and the workflow mini-app, for DeepDriveMD (column 1) and the Inverse Problem with the CPU+GPU serial execution model (column 2 for configuration V2 and column 3 for configuration V3). The top three plots show the RU of the original workflows and the bottom three plots show the RU of the workflow mini-apps. We compare the original workflow with the workflow mini-app for each experiment configuration to validate the fidelity of the workflow mini-apps. Note that CPUs and GPUs on the y-axis refer to CPU and GPU IDs, respectively.
  • Figure 5: The I/O performance of the original workflow and the workflow mini-app, for DeepDriveMD (column 1) and the Inverse Problem with the CPU+GPU serial execution model (column 2 for configuration V2 and column 3 for configuration V3). Each plot shows the total read/write size of each task in that workflow over its runtime: the left and right ends of each line segment mark the start and end time of a single emulated task, and its y-value is the I/O size of that task. The top three plots show the original workflow, and the bottom three plots show the workflow mini-app. We compare the original workflow with the workflow mini-app for each experiment configuration to validate the fidelity of the workflow mini-apps.
  • ...and 4 more figures