A Digital Twin for Diesel Engines: Operator-infused Physics-Informed Neural Networks with Transfer Learning for Engine Health Monitoring

Kamaljyoti Nath, Varun Kumar, Daniel J. Smith, George Em Karniadakis

TL;DR

The study addresses the need for fast, interpretable, system-level health monitoring of diesel engines by combining physics-informed neural networks (PINNs) with neural operators and transfer learning (TL). It introduces an operator-infused PINN framework that uses DeepONet surrogates for actuator dynamics, DNN surrogates for empirical engine relations, and PINN-based inverse modeling to identify unknown parameters from field data. To reduce online computation, two transfer-learning schemes (multi-stage TL and few-shot TL) are proposed and benchmarked against a baseline PINN, demonstrating substantial reductions in training time and improved generalization under noise. The approach offers better interpretability and robustness than purely data-driven methods, enabling more reliable engine diagnostics and paving the way for real-time digital twins in diesel engine health monitoring. Overall, the hybrid framework is demonstrated on simulated lab/field-like data, where it shows accurate parameter and state estimation with quantified uncertainty and significant computational efficiency gains, making it suitable for field deployment and system-level diagnostics.

Abstract

Improving diesel engine efficiency, reducing emissions, and enabling robust health monitoring have been critical research topics in engine modelling. While recent advancements in the use of neural networks for system monitoring have shown promising results, such methods often focus on component-level analysis and lack generalizability and physical interpretability. In this study, we propose a novel hybrid framework that combines physics-informed neural networks (PINNs) with deep operator networks (DeepONet) to enable accurate and computationally efficient parameter identification in mean-value diesel engine models. Our method leverages physics-based system knowledge in combination with data-driven training of neural networks to enhance model applicability. Incorporating offline-trained DeepONets to predict actuator dynamics significantly lowers the online computation cost compared to the existing PINN framework. To address the re-training burden typical of PINNs under varying input conditions, we propose two transfer learning (TL) strategies: (i) a multi-stage TL scheme offering better runtime efficiency than full online training of the PINN model and (ii) a few-shot TL scheme that freezes a shared multi-head network body and computes the physics-based derivatives required for model training outside the training loop. The second strategy offers a computationally inexpensive, physics-based approach for predicting engine dynamics and identifying parameters, at lower cost than the existing PINN framework. Compared to existing health monitoring methods, our framework combines the interpretability of physics-based models with the flexibility of deep learning, offering substantial gains in generalization, accuracy, and deployment efficiency for diesel engine diagnostics.

Paper Structure

This paper contains 26 sections, 76 equations, 34 figures, 11 tables.

Figures (34)

  • Figure 1: A schematic diagram of the mean-value diesel engine with a variable-geometry turbocharger and exhaust gas recirculation (Wahlstrom). The main components of the engine are the intake manifold, the exhaust manifold, the cylinder, the EGR valve system, the compressor, and the turbine. The control input vector is $\bm{u} = \{u_\delta, u_{egr}, u_{vgt}\}$, and the engine speed is $n_e$. The gas flow rates through the different subsystems are denoted by $W_{i}$ (in blue) and represent the gas flow dynamics. The blue flow path represents clean airflow, while the red path represents the air-combustion mixture. (Source: Figure adapted from Wahlstrom)
  • Figure 2: Schematic of a hybrid model of DeepONet and physics-informed neural networks (PINNs) for inverse problems. In the left part of the figure, a PINN approximates the solution $y$ to a differential equation with time $t$ as input. The top left section (enclosed within black hatched lines) represents another DNN, which takes as input $\hat{y}$ (and potentially other inputs such as ambient conditions) and outputs a function $g(\hat{y})$. This network is pre-trained with labelled laboratory data. In the top right section of the figure (enclosed in the black dashed line), a DeepONet model takes as inputs time $t$ and an input function $u$ to approximate a state $G(u)(t)$. This network is also pre-trained with labelled laboratory data. The physics/residual loss is shown in the bottom right part of the figure (enclosed within the blue dashed line). The total loss $\mathcal{L}(\bm{\theta})$ includes both the equation (physics) loss and the data loss. Weights $\lambda_1$ and $\lambda_2$ are applied to the data and physics loss, respectively, and may be fixed or adaptive depending on the problem and solution method. $\bm{\theta} = \{\bm{W}, \bm{b}, \bm{\Lambda}\}$ represents the parameters of the PINN, where $\bm{W}$ and $\bm{b}$ are the weights and biases, and $\bm{\Lambda}$ are the unknown parameters of the ODE; $\sigma$ denotes the activation function used for the PINN model, and $r$ is the residual of the equation. Further, $\bm{\theta}^P = \{\bm{W}^P, \bm{b}^P\}$ denotes the parameters of the pre-trained neural network $g(\hat{y})$. Similarly, $\bm{\theta}_{DP} = \{\bm{W}_{DP}, \bm{b}_{DP}\}$ represents the parameters of the pre-trained DeepONet $G(u)(t)$.
  • Figure 3: Schematic for multi-stage TL showing the network training protocol for different segments of time. In this approach, the hybrid PINN model is trained for the first segment of time ($0\sim60$ sec). Thereafter, for the subsequent time segments ($60\sim300$ sec, in one-minute segments), we transfer the network parameters ($\bm{\theta}$) and the estimated engine parameters ($\bm{\Lambda}$) from the hybrid PINN model trained on the first time segment. Each segment uses a different control vector $\{u_\delta, u_{egr}, u_{vgt}\}$ and engine speed $n_{e}$. The adaptive weights ($\bm{\lambda}$) are initialized at the start of training (epoch 0). The network weights and biases of all the layers ($\bm{\theta}$), the self-adaptive weights ($\bm{\lambda}$), and the unknown parameters ($\bm{\Lambda}$) are updated in the Stage I training phase. In Stage II, only the weights and biases of the last layer of the PINN model ($\bm{\theta_{l}}$) are trained, while all the hidden layers ($\bm{\theta_{h}}$) are frozen. Additionally, $\bm{\lambda}$ and $\bm{\Lambda}$ continue to be updated in this phase. In Stage III, we continue to update $\bm{\theta_{l}}$ and $\bm{\Lambda}$, while the self-adaptive weights $\bm{\lambda}$ are fixed to the values obtained at the end of Stage II. In total, for predicting across a 5-minute time window, we train five models, with the model trained for segment 1 ($0\sim60$ sec) serving as the reference model from which the remaining models acquire knowledge during the TL phase.
  • Figure 4: Schematic diagram of the few-shot TL architecture with a multi-head network. The left side of the figure shows Phase I of the few-shot TL approach. In this phase, we train the multi-head network using laboratory data, where each head approximates one minute of response (corresponding to a different time zone). The input to this network is time $0 - 60$ sec, normalized to $[-1,1]$. Phase I training is data-driven and performed offline. The right side of the figure shows Phase II, which involves TL. In this phase, we transfer the hidden layers and the derivative of the output of the last hidden layer from the network trained in Phase I. We randomly initialize the weights and biases of the last layer along with the unknown parameters. The output and the derivative of the output are predicted using Eqs. \ref{Eq:Few shot output} and \ref{Eq:Few shot derivative}, respectively. We formulate the hybrid PINN model by combining the PINN network with the DeepONet and the empirical DNN. The total loss ($\mathcal{L}_{\bm{\theta}}$) is calculated as the weighted sum of the physics and data losses. The weights $\lambda_1$ and $\lambda_2$ may be constant or adaptive depending on the problem setup and solution method. The trainable parameters are the weights and bias of the output layer of the PINN network ($\bm{\theta}_{out} = \{\bm{W}_{out}, b_{out}\}$), the unknown parameters of the equation, and, when self-adaptive weighting is used, the self-adaptive weights. The weights and biases of the hidden layers of the PINN network, the parameters of the pre-trained network for the empirical formula, and the parameters of the DeepONets remain fixed during the optimization process.
  • Figure 5: Violin plot for multi-stage TL with clean data and 151 data points: variation in the unknown parameter predictions over 30 runs. 151 data points are considered for each of the known parameters (for each minute) under clean conditions. The violin plots are limited to the maximum and minimum values of the predicted parameters. The dots on the violin plots form a swarm plot showing each predicted value for the 30 independent runs. The grey band indicates a ±7.5% error band. The dotted line in the centre indicates the true value of the unknown parameter. The top dotted line indicates 1.075 times the true value, while the bottom dotted line indicates 0.925 times the true value.
  • ...and 29 more figures
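
The few-shot TL idea described in the Figure 4 caption (a frozen network body, derivatives of the last hidden layer computed once, and only the output layer and unknown parameters trained against a weighted data-plus-physics loss) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy ODE $dy/dt + k\,y = 0$ stands in for the engine model, a fixed random tanh layer (`body`) stands in for the pre-trained multi-head body, the loss weights `lam_data` and `lam_phys` are constants, and alternating least squares replaces whatever optimizer the authors use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen, pre-trained body: one fixed tanh layer.
n_hidden = 20
w_body = rng.normal(size=n_hidden)
c_body = rng.normal(size=n_hidden)

def body(t):
    """Return frozen features H(t) and their analytic time derivative dH/dt."""
    h = np.tanh(np.outer(t, w_body) + c_body)
    return h, (1.0 - h**2) * w_body

# Toy problem (assumption): dy/dt + k*y = 0 on normalized time t in [-1, 1].
k_true = 0.5
t_col = np.linspace(-1.0, 1.0, 101)   # collocation points for the physics loss
t_obs = np.linspace(-1.0, 1.0, 21)    # sparse measurement times for the data loss
y_obs = np.exp(-k_true * (t_obs + 1.0))

# Features and derivatives are computed once, outside the training loop.
H_col, dH_col = body(t_col)
H_obs, _ = body(t_obs)
Phi_col = np.hstack([H_col, np.ones((t_col.size, 1))])     # y     = Phi_col @ theta
dPhi_col = np.hstack([dH_col, np.zeros((t_col.size, 1))])  # dy/dt = dPhi_col @ theta
Phi_obs = np.hstack([H_obs, np.ones((t_obs.size, 1))])

lam_data, lam_phys = 10.0, 1.0  # constant loss weights (assumed values)

def fit(n_iter=200, k0=0.0):
    """Alternate between the output layer theta = [W_out, b_out] and the unknown k."""
    s_data = np.sqrt(lam_data / t_obs.size)
    s_phys = np.sqrt(lam_phys / t_col.size)
    rhs = np.concatenate([s_data * y_obs, np.zeros(t_col.size)])
    k = k0
    for _ in range(n_iter):
        # With k fixed, the weighted loss is quadratic in the output layer.
        A = np.vstack([s_data * Phi_obs, s_phys * (dPhi_col + k * Phi_col)])
        theta, *_ = np.linalg.lstsq(A, rhs, rcond=None)
        # With the output layer fixed, k minimizing the residual is closed-form.
        y, dy = Phi_col @ theta, dPhi_col @ theta
        k = -(dy @ y) / (y @ y)
    return theta, k

theta, k_hat = fit()
print("estimated k:", k_hat)
```

Because `H_col` and `dH_col` never change, each iteration involves only small linear-algebra operations on the output layer, which is the source of the cost saving the caption attributes to freezing the body; no automatic differentiation through the hidden layers is needed during training.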