Preconditioning for Physics-Informed Neural Networks

Songming Liu; Chang Su; Jiachen Yao; Zhongkai Hao; Hang Su; Youjia Wu; Jun Zhu

TL;DR

This work identifies convergence pathologies in physics-informed neural networks (PINNs) and introduces a condition-number framework to diagnose and mitigate them. The authors define the relative condition number $\mathrm{cond}(\mathcal{P})$ and derive error and convergence bounds from Lipschitz properties and neural tangent kernel theory. These bounds motivate a preconditioning strategy (PCPINN) built on an ILU-based operator. Empirical results on the PINNacle benchmark show state-of-the-art performance, including major error reductions and solutions to problems that were previously intractable. Limitations include the reliance on meshing to improve conditioning and the difficulty of scaling to very high dimensions; future work aims to learn data-driven preconditioners with neural networks.

Abstract

Physics-informed neural networks (PINNs) have shown promise in solving various partial differential equations (PDEs). However, training pathologies have negatively affected the convergence and prediction accuracy of PINNs, which further limits their practical applications. In this paper, we propose to use condition number as a metric to diagnose and mitigate the pathologies in PINNs. Inspired by classical numerical analysis, where the condition number measures sensitivity and stability, we highlight its pivotal role in the training dynamics of PINNs. We prove theorems to reveal how condition number is related to both the error control and convergence of PINNs. Subsequently, we present an algorithm that leverages preconditioning to improve the condition number. Evaluations of 18 PDE problems showcase the superior performance of our method. Significantly, in 7 of these problems, our method reduces errors by an order of magnitude. These empirical findings verify the critical role of the condition number in PINNs' training.

Paper Structure (81 sections, 10 theorems, 132 equations, 4 figures, 14 tables, 4 algorithms)

Introduction
Preliminaries
Training Pathologies
Analyzing PINNs' Training Pathologies via Condition Number
Introducing Condition Number
How Condition Number Affects Error & Convergence
Training PINNs with a Preconditioner
Discretization of PDEs
Preconditioning Algorithm
Time-Dependent & Nonlinear Problems
Non-Uniform Mesh & Modern Numerical Schemes
Numerical Experiments
Overview
Relationship Between Condition Number and Error & Convergence
Benchmark of Forward Problems
...and 66 more sections

Key Result

Theorem 3.3

If $\mathcal{F}^{-1}$ is $K$-Lipschitz continuous with $K\ge 0$ in some neighbourhood of $f$, we have:

Figures (4)

Figure 1: An illustrative example of learning the 1D wave equation. (a) PINN baselines (only a subset is shown) struggle with long plateaus and severe oscillations during training. In contrast, our preconditioned PINN (PCPINN) converges quickly and achieves a much lower $L^2$ relative error (L2RE). (b) PINN wanders in the high-error zone (red), while ours dives deep and eventually converges. Red scatter points mark the model parameters at each iteration. Details are elaborated in the section "Benchmark of Forward Problems".
Figure 2: (a): Estimations of $\| \mathcal{F}^{-1} \|$ across different $P$ values, with the number after "FDM" indicating the mesh size. (b): Strong linear correlation between normalized condition numbers and associated errors. (c): Convergence in the wave equation across different condition numbers.
Figure 3: (a): Computation time of PCPINN (ours) and vanilla PINN on selected problems, with error bars showing the $[\mathrm{min}, \mathrm{max}]$ over 5 trials. (b): Scaling law of computational time relative to an 8K grid size, contrasting our PCPINN with the preconditioned conjugate gradient method (PCG) and with ILU preconditioning alone. (c): Convergence dynamics under varying preconditioner precision, with the dashed line for no preconditioner and the color bar for the condition numbers $\frac{\| {\bm{P}}^{-1}{\bm{b}} \|}{\| {\bm{u}} \|} \| {\bm{A}}^{-1}{\bm{P}} \|$ obtained at each precision.
Figure 4: The training $L^2$ relative error (L2RE) in the ablation study. The dashed line marks the training trajectory without the preconditioner.
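
The condition-number expression in the caption of Figure 3(c), $\frac{\| {\bm{P}}^{-1}{\bm{b}} \|}{\| {\bm{u}} \|} \| {\bm{A}}^{-1}{\bm{P}} \|$, can be evaluated directly on a small linear system. The following is a minimal sketch and not the authors' implementation. It assumes a finite-difference 2D Poisson matrix for ${\bm{A}}$, a checkerboard right-hand side ${\bm{b}}$, the spectral norm, and SciPy's `spilu` as a stand-in ILU preconditioner ${\bm{P}}$; the grid size and the `drop_tol` values are illustrative choices, and `relative_condition` is a hypothetical helper name.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Assumed test problem: 5-point finite-difference Laplacian on an m x m grid.
m = 12
h = 1.0 / (m + 1)
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m)) / h**2
I = sp.identity(m)
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()

# Assumed right-hand side: a checkerboard pattern (a rough, high-frequency load).
s = (-1.0) ** np.arange(1, m + 1)
b = np.outer(s, s).ravel()


def relative_condition(A, b, ilu=None):
    """Evaluate ||P^-1 b|| / ||u|| * ||A^-1 P|| in the 2-norm.

    With ilu=None, P is the identity, giving ||b|| / ||u|| * ||A^-1||.
    """
    u = spla.spsolve(A, b)
    if ilu is None:
        M, rhs = A.toarray(), b
    else:
        # ilu.solve applies P^-1, so M = P^-1 A and rhs = P^-1 b.
        M, rhs = ilu.solve(A.toarray()), ilu.solve(b)
    # ||A^-1 P|| = ||(P^-1 A)^-1|| = 1 / sigma_min(P^-1 A).
    sigma_min = np.linalg.svd(M, compute_uv=False).min()
    return np.linalg.norm(rhs) / np.linalg.norm(u) / sigma_min


cond_plain = relative_condition(A, b)

# Smaller drop_tol means a more precise (less incomplete) factorization.
cond_by_tol = {}
for tol in (1e-2, 1e-4, 1e-6):
    ilu = spla.spilu(A, drop_tol=tol, fill_factor=30)
    cond_by_tol[tol] = relative_condition(A, b, ilu)

print(f"no preconditioner: {cond_plain:.3f}")
for tol, cond in cond_by_tol.items():
    print(f"ILU drop_tol={tol:g}: {cond:.3f}")
```

Because ${\bm{P}}^{-1}{\bm{b}} = {\bm{P}}^{-1}{\bm{A}}{\bm{u}}$, the expression is always at least 1, and it approaches 1 as the factorization approaches an exact LU decomposition. Forming dense matrices and a full SVD is only practical for small grids such as this one.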

Theorems & Definitions (30)

Definition 3.1: Condition Number
Remark 3.2
Theorem 3.3
proof
Remark 3.4
Theorem 3.5
proof
Corollary 3.6: Error Control
proof
Remark 3.7
...and 20 more
