Characterizing possible failure modes in physics-informed neural networks

Aditi S. Krishnapriyan, Amir Gholami, Shandian Zhe, Robert M. Kirby, Michael W. Mahoney

TL;DR

This study analyzes physics-informed neural networks (PINNs), which enforce PDE constraints as soft residual terms in the loss. It shows that PINNs can fail to learn convection, reaction, and reaction–diffusion physics even in relatively simple settings, and that the cause is the difficulty of optimizing the PDE residual term rather than limited model expressivity. After diagnosing the loss landscape and the ill-conditioning introduced by the PDE-based regularization, the authors propose two remedies, curriculum regularization and sequence-to-sequence learning, which reduce error by up to 1–2 orders of magnitude compared with regular PINN training. The findings indicate that careful optimization strategies and problem formulations are crucial for reliable SciML deployments; the authors also provide open-source tooling to facilitate further exploration.

Abstract

Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models. The typical approach is to incorporate physical domain knowledge as soft constraints on an empirical loss function and use existing machine learning methodologies to train the model. We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena for even slightly more complex problems. In particular, we analyze several distinct situations of widespread physical interest, including learning differential equations with convection, reaction, and diffusion operators. We provide evidence that the soft regularization in PINNs, which involves PDE-based differential operators, can introduce a number of subtle problems, including making the problem more ill-conditioned. Importantly, we show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize. We then describe two promising solutions to address these failure modes. The first approach is to use curriculum regularization, where the PINN's loss term starts from a simple PDE regularization, and becomes progressively more complex as the NN gets trained. The second approach is to pose the problem as a sequence-to-sequence learning task, rather than learning to predict the entire space-time at once. Extensive testing shows that we can achieve up to 1-2 orders of magnitude lower error with these methods as compared to regular PINN training.
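As a rough illustration of what "soft constraints" means here, the sketch below evaluates a two-term loss for a 1D convection problem: one term for the initial condition and one for the PDE residual. Several things are assumptions made only for this example and are not stated in the abstract: the equation form $u_t + \beta u_x = 0$, the initial condition $\sin(x)$ on a periodic domain $[0, 2\pi)$, the time window $[0, 1]$, and the function names. Central finite differences stand in for the automatic differentiation a real PINN would use, and plain Python functions stand in for the network.

```python
import numpy as np

def pde_residual(u, x, t, beta, h=1e-4):
    """Residual u_t + beta * u_x, using central differences in place of autodiff."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
    return u_t + beta * u_x

def soft_constraint_loss(u, beta, n=64):
    """Return (initial-condition term, PDE residual term) for a candidate u(x, t)."""
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)  # periodic grid (assumed domain)
    t = np.linspace(0.0, 1.0, n)                          # assumed time window [0, 1]
    X, T = np.meshgrid(x, t)                              # collocation points
    loss_ic = np.mean((u(x, np.zeros_like(x)) - np.sin(x)) ** 2)
    loss_pde = np.mean(pde_residual(u, X, T, beta) ** 2)
    return loss_ic, loss_pde

beta = 30.0
exact = lambda x, t: np.sin(x - beta * t)   # satisfies the PDE and the initial condition
static = lambda x, t: np.sin(x) + 0.0 * t   # satisfies the initial condition only

for name, u in [("exact", exact), ("static", static)]:
    ic, pde = soft_constraint_loss(u, beta)
    print(f"{name:>6}: initial-condition term = {ic:.2e}, residual term = {pde:.2e}")
```

Both candidates match the initial condition, so only the residual term separates them. For the static candidate the residual is $\beta \cos(x)$, so its mean square is $\beta^2/2$: in this toy setting the scale of the soft-constraint term grows quadratically with $\beta$.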

Paper Structure

This paper contains 34 sections, 20 equations, 13 figures, 4 tables.

Figures (13)

  • Figure 1: Prediction error for the 1D convection problem (§\ref{subsec:learning_convection}) as $\beta$ is varied. The PINN has difficulty predicting the solution past a certain timestep, but is able to fit the boundary conditions. Additional figures for different $\beta$ values can be seen in Fig. \ref{fig:appendix_exact_vs_predicted_advection_beta}.
  • Figure 2: Prediction error for the 1D reaction-diffusion problem (§\ref{subsec:learning_rd}). The PINN has difficulty predicting the solution (especially its "sharpness") and is unable to capture the correct behavior. Additional figures for different $\nu$ values can be seen in Fig. \ref{fig:appendix_exact_vs_predicted_rd_nu}.
  • Figure 3: Loss landscapes for varying values of $\beta$, for the 1D convection example in §\ref{subsec:learning_convection}. The loss landscape is smoother at low $\beta$ and becomes increasingly complex as $\beta$ increases, which can make the optimization problem more difficult. In particular, at higher $\beta$, the optimizer gets stuck in a certain regime. These results support the claim that adding the PDE soft regularization term results in a more complex optimization loss landscape.
  • Figure 4: Schematic outlining curriculum regularization, with an example result for the 1D convection problem from §\ref{subsec:learning_convection}. The schematic compares regular PINN training with curriculum PINN training. Regular PINN training only involves training at $\beta=30$, while curriculum regularization starts at a lower $\beta$, trains a model, and then uses the weights of this model to reinitialize the NN for training at the next $\beta$. The curriculum training approach does significantly better (by almost two orders of magnitude).
  • Figure 5: Schematic outlining seq2seq learning. In contrast to regular PINN training, the solution in seq2seq learning is predicted for only one $\Delta t$ step at a time. Then, the predicted solution at $t=\Delta t$ is used as the initial condition for the next segment. To allow a fair comparison, we keep the total number of collocation points exactly the same in both approaches. That is, we do not increase the number of collocation points for seq2seq learning on the right, and keep it the same as in the corresponding segment of the left figure.
  • ...and 8 more figures
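
The training procedures described in the captions for Figures 4 and 5 can be sketched as plain control flow. This is a minimal sketch, not the authors' implementation: the names `curriculum_train`, `seq2seq_segments`, and `seq2seq_train` are hypothetical, and the `train` and `train_segment` callables are placeholders for whatever routine actually fits a PINN.

```python
from typing import Callable, List, Sequence, Tuple

Params = List[float]                      # stand-in for the NN weights
Field = Callable[[float, float], float]   # a model u(x, t)

def curriculum_train(train: Callable[[Params, float], Params],
                     init_params: Params,
                     betas: Sequence[float]) -> Params:
    """Train at increasing beta, reusing the previous weights as the next initialization."""
    params = init_params
    for beta in sorted(betas):
        params = train(params, beta)      # hypothetical trainer: returns updated weights
    return params

def seq2seq_segments(t_end: float, n_segments: int) -> List[Tuple[float, float]]:
    """Split [0, t_end] into n_segments consecutive windows of length dt."""
    dt = t_end / n_segments
    return [(k * dt, (k + 1) * dt) for k in range(n_segments)]

def seq2seq_train(train_segment: Callable[[Callable[[float], float], float, float], Field],
                  initial_condition: Callable[[float], float],
                  t_end: float,
                  n_segments: int) -> List[Field]:
    """Fit one time window at a time; each prediction at the window end seeds the next."""
    models: List[Field] = []
    ic = initial_condition
    for t0, t1 in seq2seq_segments(t_end, n_segments):
        model = train_segment(ic, t0, t1)        # hypothetical: fit a PINN on [t0, t1]
        models.append(model)
        ic = lambda x, m=model, t=t1: m(x, t)    # predicted u(x, t1) becomes the next IC
    return models
```

For a comparison in the spirit of Figure 5, `train_segment` would need to use only its share of the collocation points, so that the total across all segments matches regular training.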