Function+Data Flow: A Framework to Specify Machine Learning Pipelines for Digital Twinning

Eduardo de Conto; Blaise Genest; Arvind Easwaran

TL;DR

This work tackles the ad-hoc design of ML pipelines for digital twins by introducing Function+Data Flow (FDF), a domain-specific language that treats functions as first-class citizens within a dataflow framework. FDF provides a box-based syntax with Processor, Coder, and Trainer boxes, enabling explicit manipulation and reuse of learned models, and it incorporates implicit typing to ensure type-safe composition without burdening users with manual annotations. The paper formalizes the syntax and semantics through a DAG-based FDF graph, demonstrates automatic type propagation, and applies the framework to two motivating digital-twin use cases: material-strain prediction (DTP) and magnetic bearing instance modeling (DTI), including offline and online exploitation pipelines. By decoupling learning from usage and enabling function-level transmission, FDF aims to improve flexibility, maintainability, and cross-domain applicability of DT pipelines, with future work on library development and online learning support.

Abstract

The development of digital twins (DTs) for physical systems increasingly leverages artificial intelligence (AI), particularly for combining data from different sources or for creating computationally efficient, reduced-dimension models. Indeed, even in very different application domains, twinning employs common techniques such as model order reduction and modeling with hybrid data (that is, data sourced from both physics-based models and sensors). Despite this apparent generality, current development practices are ad hoc, making the design of AI pipelines for digital twinning complex and time-consuming. Here we propose Function+Data Flow (FDF), a domain-specific language (DSL) to describe AI pipelines within DTs. FDF aims to facilitate the design and validation of digital twins. Specifically, FDF treats functions as first-class citizens, enabling effective manipulation of models learned with AI. We illustrate the benefits of FDF on two concrete use cases from different domains: predicting the plastic strain of a structure and modeling the electromagnetic behavior of a bearing.
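
To make the idea of functions as first-class citizens concrete, the following is a minimal Python sketch, not the FDF implementation. It assumes a Trainer box that outputs a learned function and a Processor box that receives that function as an input and applies it to data. The names `trainer_box` and `processor_box`, and the choice of a least-squares line fit as the learning step, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence

# A learned model is represented as an ordinary function value.
Function = Callable[[float], float]


def trainer_box(xs: Sequence[float], ys: Sequence[float]) -> Function:
    """Hypothetical Trainer: fits y = a*x + b by least squares and
    returns the learned model as a function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def processor_box(func: Function, data: Sequence[float]) -> List[float]:
    """Hypothetical Processor: applies a function produced by an earlier box."""
    return [func(x) for x in data]


# The learned function travels along an edge like any other value.
model = trainer_box([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
predictions = processor_box(model, [3.0, 4.0])
```

Because the model is passed as a value, the training step and the step that uses the model stay separate, which is the decoupling of learning from usage that the TL;DR describes.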

Paper Structure (16 sections, 3 equations, 10 figures, 1 table)

Introduction
Related Works
Function+Data Flow Syntax
Syntax Overview
Formal Syntax
Example
Function+Data Flow Semantics
FDF Graph
FDF Execution
Implicit Typing
Implicit Types
Type propagation and checking
Application to Motivating Examples
DTP for Material Strain Prediction
DTI of a Magnetic Bearing Instance
...and 1 more section

Figures (10)

Figure 1: Pipeline for structural health monitoring.
Figure 2: Pipeline to model an active magnetic bearing.
Figure 3: Visual syntax for boxes of Function+Data Flow: Processor on top, Coder in the middle, and Trainer at the bottom. The Processor executes either a function $Func$ learned by an earlier box in the pipeline or a predefined function $PredefFunc$. The value $k$ in the Trainer's Param specifies the number of input ports to consider as $X$. The remaining ports are the $Y$ in the $(X, Y)$ supervised learning pairs.
Figure 4: Minimal FDF pipeline with annotations.
Figure 5: An FDF pipeline to learn a Strain Model DTP.
...and 5 more figures
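
The Figure 3 caption states that the Trainer's parameter $k$ divides its input ports into $X$ and $Y$. A minimal Python sketch of that split follows. It assumes the ports are held in an ordered list and that the first $k$ ports form $X$; the caption does not say which ports are taken first. The helper `split_ports`, the bounds check on $k$, and the port names are hypothetical.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_ports(ports: Sequence[T], k: int) -> Tuple[List[T], List[T]]:
    """Split a Trainer's ordered input ports into (X, Y).

    Assumption: the first k ports are X and the remaining ports are Y.
    Assumption: both sides must be non-empty for supervised (X, Y) pairs.
    """
    if not 0 < k < len(ports):
        raise ValueError("k must leave at least one port on each side")
    return list(ports[:k]), list(ports[k:])


# Hypothetical port names, used only to show the split.
x_ports, y_ports = split_ports(["load", "temperature", "strain"], k=2)
```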
