Statistical Learning Analysis of Physics-Informed Neural Networks

David A. Barajas-Solano

TL;DR

This work reframes physics-informed neural networks (PINNs) for IBVPs as a singular statistical learning problem by enforcing hard initial and boundary constraints and interpreting the physics residuals as an infinite data source. It shows that training aims to minimize a KL divergence between the PINN residual distribution $p(y\mid x, t, w) q(x, t)$ and the true zero-residual distribution $\delta(0) q(x, t)$, rather than merely regularizing the model. The Local Learning Coefficient (LLC) is introduced and numerically estimated via MCMC to characterize the flatness of PINN loss minima, with experiments on a heat equation IBVP yielding $\hat{\lambda}(w^\star) \approx 9.5$ despite a large parameter count ($d=20{,}601$), indicating very flat minima. The analysis has implications for uncertainty quantification and extrapolation in PINNs, suggesting a function-space view of Bayesian UQ and highlighting that residual data in PINNs acts as indirect data rather than a traditional regularizer.
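As a worked illustration of the KL interpretation, consider a Gaussian residual model $p(y \mid x, t, w) = \mathcal{N}(y \mid r(x, t; w), \sigma^2)$, where $r(x, t; w)$ denotes the IBVP residual of the PINN. Both the Gaussian form and the symbols $r$ and $\sigma$ are assumptions made here for illustration and may differ from the paper's exact model. Under that assumption, the $w$-dependent part of the divergence between $\delta(0) q(x, t)$ and $p(y \mid x, t, w) q(x, t)$ is the cross-entropy term, which reduces to the familiar mean squared residual:

```latex
\begin{aligned}
-\mathbb{E}_{q(x,t)}\left[\log p(0 \mid x, t, w)\right]
  &= \frac{1}{2\sigma^{2}}\,\mathbb{E}_{q(x,t)}\left[r(x, t; w)^{2}\right]
     + \frac{1}{2}\log\left(2\pi\sigma^{2}\right), \\
\frac{1}{n}\sum_{i=1}^{n} \frac{r(x_i, t_i; w)^{2}}{2\sigma^{2}}
  &\;\xrightarrow[n \to \infty]{}\;
  \frac{1}{2\sigma^{2}}\,\mathbb{E}_{q(x,t)}\left[r(x, t; w)^{2}\right],
  \qquad (x_i, t_i) \sim q(x, t).
\end{aligned}
```

Read this way, each collocation point $(x_i, t_i)$ drawn from $q$ contributes one indirect observation $y_i = 0$, and more can always be drawn, which is the sense in which the residuals act as an unlimited data source.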

Abstract

We study the training and performance of physics-informed learning for initial and boundary value problems (IBVPs) with physics-informed neural networks (PINNs) from a statistical learning perspective. Specifically, we restrict ourselves to parameterizations with hard initial and boundary condition constraints and reformulate the problem of estimating PINN parameters as a statistical learning problem. From this perspective, the physics penalty on the IBVP residuals is better understood not as a regularizing term but as an infinite source of indirect data, and the learning process as fitting the PINN distribution of residuals $p(y \mid x, t, w) q(x, t)$ to the true data-generating distribution $\delta(0) q(x, t)$ by minimizing the Kullback-Leibler divergence between the true and PINN distributions. Furthermore, this analysis shows that physics-informed learning with PINNs is a singular learning problem, and we employ singular learning theory tools, namely the so-called Local Learning Coefficient (Lau et al., 2025), to analyze the estimates of PINN parameters obtained via stochastic optimization for a heat equation IBVP. Finally, we discuss the implications of this analysis for the quantification of predictive uncertainty of PINNs and for their extrapolation capacity.
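The MCMC-based estimation of the Local Learning Coefficient can be sketched on a toy problem. The sketch below assumes the estimator takes the form $\hat{\lambda}(w^\star) = n\beta\,(\mathbb{E}[L_n(w)] - L_n(w^\star))$ with $\beta = 1/\log n$, where the expectation is over a tempered posterior localized around $w^\star$ with strength $\gamma$, as in Lau et al. It uses a full-gradient Langevin chain as a stand-in for a minibatch sampler, and the two quadratic losses, the function names, and all hyperparameter values are hypothetical choices for illustration, not the paper's PINN setup.

```python
import numpy as np


def estimate_llc(loss, grad, w_star, n, gamma=1.0, eps=1e-4,
                 n_steps=20_000, n_burn=2_000, seed=0):
    """Estimate the LLC at w_star with a localized Langevin chain (a sketch)."""
    rng = np.random.default_rng(seed)
    beta = 1.0 / np.log(n)  # assumed inverse temperature, beta = 1 / log(n)
    w = w_star.copy()
    losses = []
    for step in range(n_steps):
        # Gradient of the potential n*beta*L(w) + (gamma/2)*||w - w_star||^2
        drift = n * beta * grad(w) + gamma * (w - w_star)
        # Unadjusted Langevin step with step size eps
        w = w - 0.5 * eps * drift + np.sqrt(eps) * rng.standard_normal(w.shape)
        if step >= n_burn:
            losses.append(loss(w))
    # n * beta * (expected loss under the local posterior - loss at w_star)
    return n * beta * (np.mean(losses) - loss(w_star))


# Hypothetical toy losses in d = 2, both minimized at the origin.
def regular_loss(w):
    return 0.5 * np.sum(w ** 2)      # curved in both directions


def regular_grad(w):
    return w


def flat_loss(w):
    return 0.5 * w[0] ** 2           # completely flat along w[1]


def flat_grad(w):
    return np.array([w[0], 0.0])


w_star = np.zeros(2)
llc_regular = estimate_llc(regular_loss, regular_grad, w_star, n=10_000)
llc_flat = estimate_llc(flat_loss, flat_grad, w_star, n=10_000)
print(f"regular minimum: LLC estimate = {llc_regular:.3f} (d/2 = 1)")
print(f"flat minimum:    LLC estimate = {llc_flat:.3f}")
```

For a regular minimum the estimate should be close to $d/2$, while the loss with a flat direction should give a smaller value. This is the same qualitative signal as an LLC of about $9.5$ for a model with $d = 20{,}601$ parameters.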

Paper Structure (7 sections, 15 equations, 4 figures)

Introduction
Statistical learning interpretation of physics-informed learning for IBVPs
Numerical estimation of the local learning coefficient
MCMC-based local learning coefficient estimator
Numerical experiments
Discussion
Conclusions

Figures (4)

Figure 1: PINN solution and estimation error computed with a batch size of $32$ and an Adam learning rate of $1 \times 10^{-4}$.
Figure 2: Training and test histories (left, shown every 100 iterations), and LLC estimates with 95% confidence intervals (right, shown every 1,000 iterations) for various values of batch size and Adam learning rate.
Figure 3: LLC estimates with 95% confidence intervals shown every 1,000 iterations, computed for various values of batch size and Adam learning rate $\eta$ and for two initializations. Left: $\eta = 1 \times 10^{-4}$. Right: $\eta = 1 \times 10^{-3}$.
Figure 4: PINN pointwise absolute estimation errors for two different parameter vectors, $w^{0}$ and $w^{1}$, obtained via different initializations of the PINN model parameters. Both parameter estimates were computed using a batch size of 32 and an Adam learning rate of $1 \times 10^{-4}$ over $100{,}000$ iterations. Although both cases yield similar loss values $L_n(w)$ ($O(10^{-5})$) and similar LLCs, $\hat{\lambda}(w^0) \approx \hat{\lambda}(w^1) \approx 9.5$, they exhibit noticeably different extrapolation behavior.
