Improved generalization with deep neural operators for engineering systems: Path towards digital twin

Kazuma Kobayashi, James Daniell, Syed Bahauddin Alam

TL;DR

This work evaluates DeepONet, a neural operator network with a branch/trunk architecture, as a generalizable surrogate for learning the solution operators of ODEs and PDEs, with emphasis on its applicability to digital twins (DTs). Across three test problems (an ODE system, a diffusion-reaction PDE, and the convection-diffusion Burgers equation), the study reports near-perfect $R^2$ scores for the ODE and diffusion cases (around $0.997$ to $0.999$) and substantial difficulty with the Burgers equation, where $R^2$ averages $0.437$ and is negative for some cases. The results highlight DeepONet’s ability to generalize to unseen inputs (zero-shot) and across multiple conditions without retraining, while also revealing sensitivity to the training data and the need for improved training data strategies for complex nonlinear systems. The work argues for integrating neural operators into DT architectures with verification, uncertainty quantification, and real-time updating to realize robust, scalable digital twins in engineering contexts.
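The $R^2$ scores quoted above can be negative because the coefficient of determination compares the model's squared error against that of a constant predictor at the mean of the reference data. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `r2_score` and the synthetic sine data are assumptions made purely for illustration.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # model's squared error
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of the mean predictor
    return 1.0 - ss_res / ss_tot

# Synthetic reference solution (illustrative only, not data from the paper).
x = np.linspace(0.0, 1.0, 50)
truth = np.sin(2.0 * np.pi * x)

good = truth + 0.01 * np.cos(7.0 * x)            # small perturbation of the truth
mean_only = np.full_like(truth, truth.mean())    # constant prediction at the mean
bad = -truth                                     # prediction with the wrong sign

r2_good = r2_score(truth, good)       # close to 1
r2_mean = r2_score(truth, mean_only)  # 0: no better than the mean
r2_bad = r2_score(truth, bad)         # negative: worse than the mean
```

Under this definition, a negative score means the surrogate's predictions for that case are worse than simply predicting the average of the reference solution.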

Abstract

Neural Operator Networks (ONets) represent a novel advancement in machine learning algorithms, offering a robust and generalizable alternative for approximating the solutions of partial differential equations (PDEs). Unlike traditional neural networks (NNs), which directly approximate functions, ONets specialize in approximating mathematical operators, enhancing their efficacy in addressing complex PDEs. In this work, we evaluate the capabilities of Deep Operator Networks (DeepONets), an ONet implementation using a branch/trunk architecture. Three test cases are studied: a system of ODEs, a general diffusion system, and the convection-diffusion Burgers equation. It is demonstrated that DeepONets can accurately learn the solution operators, achieving prediction accuracy scores above 0.96 for the ODE and diffusion problems over the observed domain. More importantly, when evaluated on unseen scenarios without retraining (zero-shot), the trained models exhibit excellent generalization ability. This underscores ONets' vital niche in surrogate modeling and digital twin development across physical systems. While convection-diffusion poses a greater challenge, the results confirm the promise of ONets and motivate further enhancements to the DeepONet algorithm. This work represents an important step towards unlocking the potential of digital twins through robust and generalizable surrogates.

Paper Structure (23 sections, 7 equations, 16 figures, 8 tables)

Figures (16)

  • Figure 1: Intelligent Digital Twin Framework with Explainable AI and Interpretable ML module. The diagram shows the ML components (red boxes) exploited in different segments of the digital twin framework [kazuma_eaai].
  • Figure 2: Concept of the surrogate modeling method. The surrogate model returns predictions immediately once the input variables are given. Conventional simulations are still needed to prepare the training data for ML modeling.
  • Figure 3: Acceleration and position of the mass-spring system described by equation \ref{eq:mass-spring}. $T_{s}=0$ represents the ideal solutions. $T_{s}=4000$ and $T_{s}=8000$ represent system operation time in days.
  • Figure 4: Concept of the update module in DT, synchronizing the physical system and the DT [kobayashi2023explainable].
  • Figure 5: DeepONet branch-trunk architecture following the approach proposed in [DeepONet_Lu]. The training dataset is composed of (1) the input function $u(x_{m})$, (2) the sampling positions $P$ for the system output, and (3) the system output $s$ at the positions $P$.
  • ...and 11 more figures
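
The data layout in the Figure 5 caption (input function samples $u(x_{m})$, query positions $P$, and outputs $s$ at $P$) can be illustrated with a small forward pass. This is a hedged sketch of a generic branch/trunk operator network, not the authors' implementation: the layer widths, the `tanh` activation, the number of sensors `m`, the latent width `p`, and the helper names `mlp_init`, `mlp_apply`, and `deeponet` are all assumptions, the weights are random and untrained, and the optional output bias is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Random (untrained) weights for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp_apply(params, x):
    """tanh hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

m, p = 100, 40                      # hypothetical: m sensor points, p latent features
branch = mlp_init([m, 64, p])       # encodes the input function u(x_1), ..., u(x_m)
trunk = mlp_init([1, 64, p])        # encodes each query position in P

def deeponet(u_sensors, P):
    """Predict s at positions P for each input function.

    u_sensors: (n_funcs, m) samples of u at the fixed sensor points x_m
    P:         (n_points, 1) positions where the output is requested
    returns:   (n_funcs, n_points) dot products of branch and trunk features
    """
    b = mlp_apply(branch, u_sensors)   # (n_funcs, p)
    t = mlp_apply(trunk, P)            # (n_points, p)
    return b @ t.T

x_m = np.linspace(0.0, 1.0, m)
u = np.stack([np.sin(np.pi * k * x_m) for k in (1, 2, 3)])  # three input functions
P = np.linspace(0.0, 1.0, 25)[:, None]                      # 25 query positions
s_pred = deeponet(u, P)                                     # shape (3, 25)
```

Because the trunk network takes the query position as an input, the same trained weights can be evaluated at any position in the domain, and the branch network can be fed a new input function without retraining, which is the property the zero-shot claims rely on.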