What You See is Not What You Get: Neural Partial Differential Equations and The Illusion of Learning

Arvind Mohan, Ashesh Chattopadhyay, Jonah Miller

TL;DR

This study shows that NeuralPDEs learn the artifacts in the simulation training data that arise from the truncation error of the discretized Taylor series used for the spatial derivatives. It also observes that, in initial-value problems for PDEs, the initial condition constrains the truncation error and thereby limits extrapolation.

Abstract

Differentiable programming for scientific machine learning (SciML) has recently seen considerable interest and success, as it directly embeds neural networks inside PDEs derived from first-principles physics; such models are often called NeuralPDEs. Consequently, there is a widespread assumption in the community that NeuralPDEs are more trustworthy and generalizable than black-box models. However, like any SciML model, differentiable programming relies predominantly on high-quality PDE simulations as "ground truth" for training, and these simulations are only discrete numerical approximations of the true physics. We therefore ask: are NeuralPDEs and differentiable programming models trained on PDE simulations as physically interpretable as we think? In this work, we attempt to answer this question rigorously, using established ideas from numerical analysis, experiments, and analysis of model Jacobians. Our study shows that NeuralPDEs learn the artifacts in the simulation training data that arise from the truncation error of the discretized Taylor series used for the spatial derivatives. Additionally, NeuralPDE models are systematically biased, and their generalization capability is likely enabled by a fortuitous interplay of numerical dissipation and truncation error in the training dataset and the NeuralPDE, which seldom happens in practical applications. This bias manifests aggressively even in relatively accessible 1-D equations, raising concerns about the veracity of differentiable programming on complex, high-dimensional, real-world PDEs and about the dataset integrity of foundation models. Further, we observe that, in initial-value problems for PDEs, the initial condition constrains the truncation error and thereby limits extrapolation. Finally, we demonstrate that an eigenanalysis of model weights can indicate a priori whether the model will be inaccurate in out-of-distribution testing.
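To make the notion of truncation error concrete, the sketch below measures how the error of a finite-difference derivative depends on the order of the truncated Taylor series. It is an illustration under stated assumptions, not the paper's experimental setup: the test function $\sin(x)$, the periodic grid, the second- and fourth-order central stencils, and the function names `central_diff`, `max_error`, and `observed_order` are all chosen here for demonstration.

```python
import numpy as np

def central_diff(u, dx, order):
    """Periodic central-difference approximation of du/dx.

    Illustrative helper (not from the paper): supports 2nd- and
    4th-order accurate stencils on a uniform periodic grid.
    """
    if order == 2:
        # (u[i+1] - u[i-1]) / (2 dx), leading truncation error O(dx^2)
        return (np.roll(u, -1) - np.roll(u, 1)) / (2 * dx)
    if order == 4:
        # (-u[i+2] + 8 u[i+1] - 8 u[i-1] + u[i-2]) / (12 dx), error O(dx^4)
        return (-np.roll(u, -2) + 8 * np.roll(u, -1)
                - 8 * np.roll(u, 1) + np.roll(u, 2)) / (12 * dx)
    raise ValueError("order must be 2 or 4")

def max_error(n, order):
    """Maximum error of the discrete derivative of sin(x) on n grid points."""
    x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    return np.max(np.abs(central_diff(np.sin(x), dx, order) - np.cos(x)))

def observed_order(order, n=64):
    """Estimate the convergence order by halving the grid spacing."""
    return np.log2(max_error(n, order) / max_error(2 * n, order))

if __name__ == "__main__":
    for p in (2, 4):
        print(f"stencil order {p}: error(n=64) = {max_error(64, p):.3e}, "
              f"observed order = {observed_order(p):.2f}")
```

Both stencils approximate the same continuous derivative, yet each leaves a different, resolution-dependent residual in the data it generates. A model trained on such data sees that residual as part of its "ground truth", which is the kind of artifact the abstract refers to.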

Paper Structure

This paper contains 12 sections, 17 equations, 10 figures.

Figures (10)

  • Figure 1: Ground Truth Expts 1 and 2 for training $\phi_{0} = 2$
  • Figure 2: Burgers NeuralPDE model sensitivity when $p=k$ (Expt 1) and $p \neq k$ (Expt 2) for training IC and unseen IC. Expt 2 shows degraded performance even for training IC, and collapses at some ICs, while Expt 1 remains stable and relatively accurate.
  • Figure 3: Log of RMS error growth with variance in IC for Expt 1 and 2. Expt 2 ($p \neq k$) shows aggressive increase in error compared to Expt 1 ($p=k$).
  • Figure 4: gKdV Expt 1 and Expt 2 Predictions for $c^{train} = 2.5$
  • Figure 5: gKdV NeuralPDE model sensitivity when $p=k$ (Expt 1) and $p \neq k$ (Expt 2) for unseen $c$.
  • ...and 5 more figures