Mathematics of Digital Twins and Transfer Learning for PDE Models

Yifei Zong, Alexandre Tartakovsky

TL;DR

This work develops a KL-NN surrogate-based digital twin (DT) framework for PDE-governed systems by representing state and control fields with truncated Karhunen–Loève expansions and learning a reduced mapping from control KL coefficients to state KL coefficients. A moment-equation analysis quantifies transfer learning (TL) across source and target conditions, revealing that TL is exact in the linear PDE setting (one-shot TL) and only partially transferable for nonlinear PDEs, where a physics-informed last-layer retraining strategy (PI-KL-DNN) enables few-shot or even one-shot adaptation. Numerical examples on linear and nonlinear diffusion validate the theory: linear TL remains robust to changes in the target covariances, while nonlinear TL benefits from a small control variance and can be enhanced via PI-KL-DNN and data assimilation. Overall, the framework provides concrete guidance for constructing adaptable, differentiable DTs that minimize labeled data requirements under changing operating conditions, with implications for real-time control and optimization of PDE systems.

Abstract

We define a digital twin (DT) of a physical system governed by partial differential equations (PDEs) as a model for real-time simulations and control of the system behavior under changing conditions. We construct DTs using the Karhunen–Loève Neural Network (KL-NN) surrogate model and transfer learning (TL). The surrogate model allows fast inference and differentiability with respect to control parameters for control and optimization. TL is used to retrain the model for new conditions with minimal additional data. We employ the moment equations to analyze TL and identify parameters that can be transferred to new conditions. The proposed analysis also guides the control variable selection in DT to facilitate efficient TL. For linear PDE problems, the non-transferable parameters in the KL-NN surrogate model can be exactly estimated from a single solution of the PDE corresponding to the mean values of the control variables under new target conditions. Retraining an ML model with a single solution sample is known as one-shot learning, and our analysis shows that one-shot TL is exact for linear PDEs. For nonlinear PDE problems, transferring any parameters introduces errors. For a nonlinear diffusion PDE model, we find that for a relatively small range of control variables, some surrogate model parameters can be transferred without introducing a significant error, some can be approximately estimated from the mean-field equation, and the rest can be found by solving either a linear residual least-squares problem or, if a small labeled dataset for new conditions is available, an ordinary linear least-squares problem. The former approach results in one-shot TL, while the latter is an example of few-shot TL. Both methods are approximate for nonlinear PDEs.
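
As an illustration of the truncated Karhunen–Loève decomposition (KLD) that the KL-NN surrogate relies on, the following is a minimal NumPy sketch, not the authors' implementation. It estimates the mean, eigenfunctions, and eigenvalues from an ensemble of sampled fields, then maps fields to KL coefficients (inverse KLD) and back (forward KLD). The function names `fit_kld`, `inverse_kld`, and `forward_kld`, the synthetic sine-mode ensemble, and the choice of 10 retained modes are all assumptions made for this example.

```python
import numpy as np

def fit_kld(samples, n_modes):
    """Estimate a truncated KLD from an ensemble (rows = samples, cols = grid points)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # The SVD of the centered ensemble yields the eigenpairs of the sample covariance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    eigvals = s[:n_modes] ** 2 / (samples.shape[0] - 1)
    eigfuncs = vt[:n_modes].T  # shape (n_grid, n_modes), orthonormal columns
    return mean, eigfuncs, eigvals

def inverse_kld(fields, mean, eigfuncs, eigvals):
    """Inverse KLD: project fields onto the eigenfunctions to get KL coefficients."""
    return (fields - mean) @ eigfuncs / np.sqrt(eigvals)

def forward_kld(coeffs, mean, eigfuncs, eigvals):
    """Forward KLD: reconstruct fields from KL coefficients."""
    return mean + (coeffs * np.sqrt(eigvals)) @ eigfuncs.T

# Synthetic ensemble (hypothetical): a mean profile plus 20 sine modes with decaying amplitude.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 64)
k = np.arange(1, 21)
modes = np.sin(np.pi * np.outer(k, x)) / k[:, None] ** 2  # shape (20, 64)
samples = 1.0 + x + rng.normal(size=(500, 20)) @ modes

mean, eigfuncs, eigvals = fit_kld(samples, n_modes=10)
xi = inverse_kld(samples, mean, eigfuncs, eigvals)    # reduced inputs for a surrogate
recon = forward_kld(xi, mean, eigfuncs, eigvals)

# Truncation error relative to the size of the fluctuations about the mean.
rel_err = np.linalg.norm(recon - samples) / np.linalg.norm(samples - mean)
print("relative truncation error:", rel_err)
```

In a KL-NN-style surrogate, coefficients such as `xi` for the control variables would be the network inputs, and the analogous coefficients of the state variable would be the network outputs.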

Paper Structure (13 sections, 109 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: Schematic description of TL for the KL-NN surrogate of the PDE model \ref{eq:component_model}-\ref{eq:BC}: KL-NN is initially trained using a sufficiently large source dataset (the orange box). The control and state variables are represented with the truncated KLDs whose means and covariances describe the range of the variables under source conditions. The DNN parameters $\bm\theta^s = (\bm{W}_{1:N+1}^s,\bm{b}_{1:N+1}^s)$ are computed from the source training dataset based on Eq. \ref{eq:surrogate_loss}. In the inference step, the DNN inputs $\bm \xi^s$ are obtained with the inverse KLD operators acting on control variables. The DNN output $\bm \eta^s$ is transformed into the state variable using the (forward) KLD. The KL-NN model is retrained for inference under target conditions (blue box) by transferring the means and eigenfunctions of the source control variables and the state variable eigenfunctions. For nonlinear PDEs, the parameters of the last layer of the DNN, $\bm{W}_{N+1}^t$ and $\bm{b}_{N+1}^t$, are retrained using Eq. \ref{eq:pi_loss_target}. For linear PDEs, $\bm{W}_{1:N}^s=0$, $\bm{b}_{1:N+1}^s =0$, and $\bm{W}_{N+1}^t = \bm{W}_{N+1}^s$, i.e., the NN mapping is linear and its parameters are transferable. The target state is predicted as $\hat{\bm h}^t \approx \mathcal{KL}[\overline{\bm{h}}^t, \bm \Psi^s, \bm\eta^t]$, where $\overline{\bm{h}}^t$ is computed from the mean-field equation subject to the target IBCs. In summary, the proposed TL-KL-NN approach enables "one-shot" learning for linear PDEs and "few-shot" learning for nonlinear PDEs.
  • Figure 2: Linear diffusion problem: (a) source and (b) target $h$ solutions versus the reference solutions at three selected times ($t_1 = T/50, t_2 = T/5, t_3 = T$). The reference solutions are shown with a solid line, and KL-NN predictions are marked with open circles.
  • Figure 3: Relative error $\varepsilon$ in the T2 target solution with $\alpha=0.5$ as a function of $\gamma$.
  • Figure 4: Nonlinear diffusion equation: the target mean $\overline{h}^t(x,t)$ solution obtained from MCS (open circles) and the mean-field equation (solid line) for $\sigma^2_y=0.1$ (left), 0.3 (middle), and 0.6 (right).
  • Figure 5: Nonlinear diffusion equation: four leading eigenfunctions of $h$ for the source (top row) and target (bottom) problems computed from MCS for (a) $\sigma^2_y=0.1$, (b) $0.3$, and (c) $0.6$.
  • ...and 2 more figures
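
The one-shot TL workflow for linear problems outlined in the Figure 1 caption can be sketched on a toy problem. The code below is an illustration under stated assumptions, not the paper's setup: the PDE solve is replaced by a hypothetical affine map `solve(y, offset) = A y + offset`, where the offset stands in for the initial and boundary conditions; the control fluctuations lie in a five-dimensional subspace so that truncation is exact; and the control mean is the same for source and target, while the target control covariance is four times larger. The source eigenfunctions and the linear coefficient map `W` are transferred, and only the state mean is recomputed from a single solve under target conditions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_y, n_h, r = 30, 40, 5                 # control grid, state grid, number of KL modes

A = rng.normal(size=(n_h, n_y))         # hypothetical linear solution operator
c_source = rng.normal(size=n_h)         # offset standing in for source conditions
c_target = rng.normal(size=n_h)         # offset standing in for target conditions

def solve(y, offset):
    """Toy stand-in for a linear PDE solve: an affine map from control to state."""
    return y @ A.T + offset

def fit_kld(samples, n_modes):
    """Mean, eigenfunctions, and eigenvalues of the sample covariance."""
    mean = samples.mean(axis=0)
    _, s, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:n_modes].T, s[:n_modes] ** 2 / (samples.shape[0] - 1)

# Source training: sample controls, solve, build both KLDs, fit the linear map W.
basis = np.linalg.qr(rng.normal(size=(n_y, r)))[0]    # fluctuation subspace
y_s = rng.normal(size=n_y) + rng.normal(size=(200, r)) @ basis.T
h_s = solve(y_s, c_source)
y_bar, Phi, lam = fit_kld(y_s, r)
h_bar_s, Psi, mu = fit_kld(h_s, r)
xi_s = (y_s - y_bar) @ Phi / np.sqrt(lam)
eta_s = (h_s - h_bar_s) @ Psi / np.sqrt(mu)
W = np.linalg.lstsq(xi_s, eta_s, rcond=None)[0]       # eta = xi @ W

# Target conditions: same control mean, larger control variance, new offset.
y_t = y_bar + 2.0 * rng.normal(size=(50, r)) @ basis.T
h_t_ref = solve(y_t, c_target)                        # reference, for checking only

# One-shot TL: a single target solve at the control mean gives the new state mean.
h_bar_t = solve(y_bar, c_target)
eta_t = ((y_t - y_bar) @ Phi / np.sqrt(lam)) @ W      # transferred Phi, lam, W
h_t_tl = h_bar_t + (eta_t * np.sqrt(mu)) @ Psi.T      # transferred Psi, mu
h_t_naive = h_bar_s + (eta_t * np.sqrt(mu)) @ Psi.T   # no update of the state mean

def rel_err(pred):
    return np.linalg.norm(pred - h_t_ref) / np.linalg.norm(h_t_ref)

err_tl, err_naive = rel_err(h_t_tl), rel_err(h_t_naive)
print("one-shot TL error:", err_tl)
print("error without updating the mean:", err_naive)
```

In this toy setting the transferred model matches the target reference to round-off because the map is affine and the fluctuations stay in the retained subspace. For nonlinear PDEs, the abstract states that any parameter transfer introduces errors, so this exactness should not be expected there.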