MaRDIFlow: A CSE workflow framework for abstracting meta-data from FAIR computational experiments
Pavan L. Veluvali, Jan Heiland, Peter Benner
TL;DR
The paper tackles the reproducibility and interoperability challenges of metadata-rich computational experiments by introducing MaRDIFlow, a CSE workflow framework that abstracts metadata within an ontology of mathematical objects and encapsulates execution and environment dependencies in multi-layered descriptions. The approach treats workflow components as interchangeable input/output objects, so that different realizations of the same component can stand in for one another, which improves robustness and reuse. A working CLI prototype demonstrates how MaRDIFlow can document and execute FAIR-compliant computational experiments and track their provenance, illustrated with a CO2 methanization reactor model and a 2D Cahn–Hilliard spinodal decomposition example. Contributions include a formal multi-layered abstraction for CSE workflows, integrated provenance tracking, and a path toward an electronic lab notebook (ELN) for visualizing and executing MaRDIFlow-enabled experiments. The framework aims to improve the Findability, Accessibility, Interoperability, and Reusability of abstracted workflow components across heterogeneous platforms and use cases in the mathematical sciences.
Abstract
Numerical algorithms and computational tools are instrumental in navigating and addressing complex simulation and data processing tasks. The exponential growth of metadata and parameter-driven simulations has led to an increasing demand for automated workflows that can replicate computational experiments across platforms. In general, a computational workflow is defined as a sequential description for accomplishing a scientific objective, often described by tasks and their associated data dependencies. If characterized through their input-output relations, workflow components can be structured so that individual tasks and their accompanying metadata can be used interchangeably. In the present work, we develop a novel computational framework, MaRDIFlow, that focuses on automating the abstraction of metadata embedded in an ontology of mathematical objects. The framework also addresses the inherent execution and environmental dependencies by incorporating them into multi-layered descriptions. Additionally, we demonstrate a working prototype with example use cases and methodically integrate them into our workflow tool and data provenance framework. Furthermore, we show how best to apply the FAIR principles to computational workflows, such that abstracted components are Findable, Accessible, Interoperable, and Reusable.
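
The idea of characterizing workflow components through their input-output relations can be illustrated with a minimal Python sketch. This is not MaRDIFlow's actual API: the `Component` class, the `interchangeable` and `execute` helpers, and the metadata keys are all hypothetical names chosen for illustration. The sketch assumes that two components count as interchangeable when they declare the same named inputs and outputs, and it uses two realizations of the same task (a closed-form integral and a midpoint-rule quadrature) as stand-ins.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Component:
    """Hypothetical workflow component described only by its I/O relation."""
    name: str
    inputs: List[str]
    outputs: List[str]
    run: Callable[[Dict[str, float]], Dict[str, float]]
    # Free-form descriptive layer (illustrative keys, not a real schema).
    metadata: Dict[str, str] = field(default_factory=dict)

    def signature(self) -> Tuple[Tuple[str, ...], Tuple[str, ...]]:
        # The I/O relation is reduced here to the sets of input/output names.
        return tuple(sorted(self.inputs)), tuple(sorted(self.outputs))


def interchangeable(a: Component, b: Component) -> bool:
    """Assumption: equal I/O signatures mean one component can replace the other."""
    return a.signature() == b.signature()


def execute(component: Component, values: Dict[str, float]) -> Dict[str, float]:
    """Run a component on the declared inputs and return only declared outputs."""
    missing = [key for key in component.inputs if key not in values]
    if missing:
        raise KeyError(f"{component.name}: missing inputs {missing}")
    result = component.run({key: values[key] for key in component.inputs})
    return {key: result[key] for key in component.outputs}


def closed_form(values: Dict[str, float]) -> Dict[str, float]:
    # Integral of x^2 over [0, b] evaluated analytically.
    b = values["b"]
    return {"integral": b ** 3 / 3.0}


def midpoint_rule(values: Dict[str, float]) -> Dict[str, float]:
    # Same integral approximated with the composite midpoint rule.
    b, n = values["b"], 1000
    h = b / n
    total = sum(((i + 0.5) * h) ** 2 for i in range(n))
    return {"integral": total * h}


analytic = Component("analytic", ["b"], ["integral"], closed_form,
                     {"realization": "closed-form expression"})
numeric = Component("numeric", ["b"], ["integral"], midpoint_rule,
                    {"realization": "midpoint quadrature, n=1000"})

if interchangeable(analytic, numeric):
    # Either realization can serve the same workflow step.
    reference = execute(analytic, {"b": 3.0})
    redundant = execute(numeric, {"b": 3.0})
```

Because both realizations expose the same signature, a workflow step that needs `integral` from `b` can use either one, and comparing the two results gives a simple redundancy check. A real framework would need a richer notion of equivalence (types, units, tolerances) than the name matching assumed here.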
