A multifidelity approach to continual learning for physical systems

Amanda Howard, Yucheng Fu, Panos Stinis

TL;DR

This work tackles catastrophic forgetting in sequential learning for physical systems by introducing Multifidelity Continual Learning (MFCL), which exploits the correlation between the outputs of previously trained models and the desired output on the current domain. MFCL treats the prior model's output as a low-fidelity prediction and trains linear and nonlinear subnetworks to correct it on the current domain, enabling accurate learning across multiple subdomains with smaller networks. The framework pairs naturally with physics-informed neural networks (PINNs) and can be combined with memory aware synapses (MAS) and replay to improve retention on long-time simulations (e.g., pendulum dynamics, the Allen-Cahn equation) and data-informed tasks (vanadium redox-flow batteries, energy consumption). Results indicate that MFCL reduces forgetting, is robust to hyperparameter choices, and is privacy-friendly and potentially compatible with federated learning. Code and data are available for reproducibility.

Abstract

We introduce a novel continual learning method based on multifidelity deep neural networks. This method learns the correlation between the output of previously trained models and the desired output of the model on the current training dataset, limiting catastrophic forgetting. On its own, the multifidelity continual learning method shows robust results that limit forgetting across several datasets. Additionally, we show that the multifidelity method can be combined with existing continual learning methods, including replay and memory aware synapses, to further limit catastrophic forgetting. The proposed continual learning method is especially suited to physical problems where the data satisfy the same physical laws on each domain, or to physics-informed neural networks, because in these cases we expect a strong correlation between the output of the previous model and the model on the current training domain.
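
The correction step described above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the helper names (`init_dense`, `dense`, `make_mf_model`), the layer sizes, and the stand-in previous model `nn_prev` are all hypothetical, and feeding the coordinates together with the previous model's output into both subnetworks is an assumption about the input layout. The weights are random and untrained; only the composition of a frozen previous model with a linear and a nonlinear subnetwork is shown.

```python
import math
import random


def init_dense(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (hypothetical helper)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, inputs, activation=None):
    """Apply one dense layer, optionally followed by an elementwise activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, inputs)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out


def make_mf_model(prev_model, n_coords, n_out, hidden, rng):
    """Build a model for the current domain on top of a frozen previous model."""
    n_in = n_coords + n_out  # assumption: coordinates plus previous output
    linear = init_dense(n_in, n_out, rng)      # linear subnetwork (no activation)
    nl_hidden = init_dense(n_in, hidden, rng)  # nonlinear subnetwork, hidden layer
    nl_out = init_dense(hidden, n_out, rng)    # nonlinear subnetwork, output layer

    def model(point):
        low_fidelity = prev_model(point)  # previous model is evaluated, never updated
        features = list(point) + list(low_fidelity)
        lin = dense(linear, features)
        nonlin = dense(nl_out, dense(nl_hidden, features, math.tanh))
        # The prediction is the sum of the linear and nonlinear subnetwork outputs.
        return [a + b for a, b in zip(lin, nonlin)]

    return model


def nn_prev(point):
    """Hypothetical stand-in for a previously trained two-output model of time t."""
    (t,) = point
    return [math.cos(t), -math.sin(t)]


rng = random.Random(0)
nn_i = make_mf_model(nn_prev, n_coords=1, n_out=2, hidden=8, rng=rng)
prediction = nn_i([8.5])
print(prediction)
```

Because `make_mf_model` accepts any callable as `prev_model`, the returned model can itself be passed in as the previous model for the next subdomain, which mirrors the sequential structure of the method.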

Paper Structure (17 sections, 11 equations, 17 figures, 5 tables)


Figures (17)

  • Figure 1: Graphical abstract
  • Figure 2: Diagram of the MF-CL method on domain $\Omega_i$. The output from the previously trained neural network, $\mathcal{NN}_{i-1}(\mathbf{x}, t; \gamma_{i-1})$, is used as input to the linear and nonlinear subnets for a point $(\mathbf{x}, t) \in \Omega_i$, $\mathbf{x}\in \mathbb{R}^N$. The output neural network is the sum of the linear and nonlinear subnetworks.
  • Figure 3: Results from training a single PINN to satisfy Eqs. \ref{eq:pendulum_1} and \ref{eq:pendulum_2} (solid lines) compared with the exact solution (dotted line) for $s_1$ (left) and $s_2$ (right). The results decay to zero quickly and the learned solution does not agree well with the exact solution.
  • Figure 4: Results from training the single fidelity (a) and multifidelity (b) networks alone to satisfy Eqs. \ref{eq:pendulum_1} and \ref{eq:pendulum_2} compared with the exact solution (dash-dotted line) for $s_1$ (left) and $s_2$ (right). Of particular importance is the final network, $\mathcal{NN}_5$ (blue solid line), which is trained on $\Omega_5 = [8, 10]$. While the multifidelity results in (b) have significant errors, they are substantially better than the single fidelity results in (a). In the single fidelity training, each network $\mathcal{NN}_i$ is only accurate on the subdomain $\Omega_i$, and extrapolation outside $\Omega_i$ presents significant difficulties.
  • Figure 5: Results from training the single fidelity (a) and multifidelity (b) networks with MAS to satisfy Eqs. \ref{eq:pendulum_1} and \ref{eq:pendulum_2} compared with the exact solution (dash-dotted line) for $s_1$ (left) and $s_2$ (right). Of particular importance is the final network, $\mathcal{NN}_5$ (blue solid line), which is trained on $\Omega_5 = [8, 10]$. The simulations plotted here have the smallest RMSEs of $\mathcal{NN}_5$ on $\Omega$ of any of the sets of hyperparameters tested. In the single fidelity case, MAS appears to restrict training too strictly, and later networks $\mathcal{NN}_i$ are no longer accurate on their respective domains $\Omega_i$. For the multifidelity training, the solutions are accurate across a wider portion of the full domain, and the RMSE is decreased compared with multifidelity training alone.
  • ...and 12 more figures
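
The restriction that MAS places on training, noted in the Figure 5 caption, comes from a quadratic penalty on parameter changes. The sketch below follows the standard MAS formulation, in which each parameter's importance is the average magnitude of the gradient of the squared output norm; how the paper combines this with the multifidelity networks is not shown here. The function names, the finite-difference gradient, and the one-dimensional toy model are illustrative assumptions.

```python
def mas_importance(model_fn, params, inputs, eps=1e-6):
    """Importance of each parameter: mean |d ||f(x; params)||^2 / d param| over inputs.

    Gradients are approximated with central differences to keep the sketch
    dependency-free; a real implementation would use automatic differentiation.
    """
    def sq_norm(p, x):
        return sum(o * o for o in model_fn(p, x))

    omega = []
    for k in range(len(params)):
        up = params[:k] + [params[k] + eps] + params[k + 1:]
        down = params[:k] + [params[k] - eps] + params[k + 1:]
        grads = [abs(sq_norm(up, x) - sq_norm(down, x)) / (2 * eps) for x in inputs]
        omega.append(sum(grads) / len(grads))
    return omega


def mas_penalty(params, old_params, omega, lam):
    """Quadratic penalty lam * sum_k omega_k * (theta_k - theta_k_old)^2."""
    return lam * sum(w * (p - q) ** 2 for w, p, q in zip(omega, params, old_params))


def toy_model(p, x):
    """Hypothetical one-output model f(x) = p[0] * x + p[1]."""
    return [p[0] * x + p[1]]


old_params = [2.0, 0.0]  # parameters after training on the previous domain
omega = mas_importance(toy_model, old_params, inputs=[1.0, 2.0])
penalty = mas_penalty([2.5, 0.0], old_params, omega, lam=0.1)
print(omega, penalty)
```

The penalty is added to the loss on the current domain, so a large `lam` or large importance values pin the parameters near their old values. This is one way to read the single fidelity behaviour reported in the caption, where later networks lose accuracy on their own domains.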