Noisy PDE Training Requires Bigger PINNs

Sebastien Andre-Sloan, Anirbit Mukherjee, Matthew Colbrook

TL;DR

It is demonstrated that PINNs must exceed a threshold model size in order to train to errors below $\sigma^2$, the variance of the noisy supervision labels.

Abstract

Physics-Informed Neural Networks (PINNs) are increasingly used to approximate solutions of partial differential equations (PDEs), particularly in high dimensions. In real-world settings, data are often noisy, making it crucial to understand when a predictor can still achieve low empirical risk. Yet, little is known about the conditions under which a PINN can do so effectively. We analyse PINNs applied to the Hamilton--Jacobi--Bellman (HJB) PDE and establish a lower bound on the network size required for the supervised PINN empirical risk to fall below the variance of noisy supervision labels. Specifically, if a predictor achieves empirical risk $O(η)$ below $σ^2$ (the variance of the supervision data), then necessarily $d_N\log d_N\gtrsim N_s η^2$, where $N_s$ is the number of samples and $d_N$ the number of trainable parameters. A similar constraint holds in the fully unsupervised PINN setting when boundary labels are noisy. Thus, simply increasing the number of noisy supervision labels does not offer a ``free lunch'' in reducing empirical risk. We also give empirical studies on the HJB PDE, the Poisson PDE and the Navier-Stokes PDE set up to produce the Taylor-Green solutions. In these experiments we demonstrate that PINNs indeed need to be beyond a threshold model size for them to train to errors below $σ^2$. These results provide a quantitative foundation for understanding parameter requirements when training PINNs in the presence of noisy data.
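
To get a feel for the scaling in the abstract, the relation $d_N\log d_N\gtrsim N_s η^2$ can be inverted numerically. The sketch below is illustrative only: the abstract does not state the constant hidden in $\gtrsim$, so the constant `c` (set to 1 here), the use of the natural logarithm, and the function name `min_params` are all assumptions rather than values taken from the paper.

```python
import math


def min_params(n_samples: int, eta: float, c: float = 1.0) -> int:
    """Smallest integer d >= 2 with d * log(d) >= c * n_samples * eta**2.

    `c` stands in for the unspecified constant in the asymptotic
    inequality; c = 1 is an assumption made purely for illustration.
    """
    target = c * n_samples * eta ** 2
    lo, hi = 2, 2
    # Grow the upper bracket until it satisfies the inequality.
    while hi * math.log(hi) < target:
        hi *= 2
    # Binary search for the smallest d in [lo, hi] that meets the target
    # (d * log(d) is increasing for d >= 2, so the search is valid).
    while lo < hi:
        mid = (lo + hi) // 2
        if mid * math.log(mid) >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    # Hypothetical sample counts; eta is the gap below the noise variance.
    for n in (10_000, 100_000, 1_000_000):
        print(n, min_params(n, eta=0.1))
```

Under these assumptions, the required parameter count grows almost linearly in $N_s$ for fixed $η$ (up to the logarithmic factor), which is the sense in which adding more noisy labels does not come for free.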

Paper Structure

This paper contains 20 sections, 11 theorems, 101 equations, 1 figure.

Key Result

Theorem 4.1

Consider the data and PINN loss function for the HJB PDE as given in lossfunction. Let the initial condition $g$ be bounded by $\|g\|_\infty \leq G$, and let $\sigma^2=\frac{1}{N_s}\sum_{i=1}^{N_s}\mathbb{E}[z_i^2]$, where $z_i=y_{si}-\mathbb{E}[y_{si}|(x_{si},t_{si})]$, with supervision data sample then the size of the model ($d_N$) is lower bounded by the number of supervision samples ($N_s$):

Figures (1)

  • Figure 1: Each row shows PINN experiments on a different PDE (Navier-Stokes, Poisson, and HJB) for neural nets consistent with the theoretical assumptions; the experimental details are described in Section \ref{sec:num_experiments}. For each PDE we tested two different supervision noise variances $\sigma^2$. The vertical axis is the training error. The red dots represent the final training error, while the blue dots show the training error every 100 updates, except for Navier-Stokes where it is every 10. The training error diminishes as the model size increases before plateauing at a level below the variance, which is marked by the red line.

Theorems & Definitions (15)

  • Definition 3.1: Class of NNs
  • Definition 3.2
  • Theorem 4.1
  • Remark 4.2: Good solvers
  • Remark 4.3: Extension to general $\eta$
  • Theorem 4.4
  • Lemma 4.5
  • Lemma 4.6
  • Lemma 4.7: Perturbation bound for PINN empirical risk for HJB
  • Lemma 4.1: Upper bound on $\partial_t h_{\bm{w}} - \partial_t h_{{\bm{w}}_\frac{\eta}{2}}$
  • ...and 5 more