Convergence Guarantees for Gradient-Based Training of Neural PDE Solvers: From Linear to Nonlinear PDEs

Wei Zhao; Tao Luo

TL;DR

This work develops a unified convergence framework for neural PDE solvers, covering both PINNs and the Deep Ritz method, in linear and nonlinear regimes. For linear PDEs, it gives an NTK-based global convergence theory for a broad class of linear operators. For nonlinear PDEs, it uses the Łojasiewicz inequality to guarantee convergence to critical points under a random feature model, which also reveals an implicit regularization effect. Under a coercivity condition, both gradient flow and implicit gradient descent converge, and the analysis explains why training trajectories keep the parameters bounded without explicit regularization. Numerical experiments on the Burgers', Allen–Cahn, and Fisher–KPP equations support the theory, showing robustness to multiscale dynamics and exposing the limitations of the NTK in nonlinear settings. The authors identify deeper architectures and SGD as directions for future work.

Abstract

We present a unified convergence theory for gradient-based training of neural network methods for partial differential equations (PDEs), covering both physics-informed neural networks (PINNs) and the Deep Ritz method. For linear PDEs, we extend the neural tangent kernel (NTK) framework for PINNs to establish global convergence guarantees for a broad class of linear operators. For nonlinear PDEs, we prove convergence to critical points via the Łojasiewicz inequality under the random feature model, eliminating the need for strong over-parameterization and encompassing both gradient flow and implicit gradient descent dynamics. Our results further reveal that the random feature model exhibits an implicit regularization effect, preventing parameter divergence to infinity. Theoretical findings are corroborated by numerical experiments, providing new insights into the training dynamics and robustness of neural network PDE solvers.
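As a rough illustration of two terms used above, the following sketch trains a random feature model with implicit gradient descent. It is not the authors' setup: the 1D Poisson problem -u'' = f on (0, 1) with zero boundary values, the tanh features, the sampling scales, the step size, and all names are assumptions chosen for brevity. Because this PDE is linear and only the outer coefficients are trained, the PINN loss is quadratic and the implicit update a_next = a - eta * grad L(a_next) reduces to one linear solve. For a nonlinear PDE the same update would need a nonlinear solve at each step.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 200                       # random features, collocation points (assumed sizes)
w = rng.normal(0.0, 3.0, size=m)     # frozen inner weights, never trained
b = rng.normal(0.0, 1.0, size=m)     # frozen inner biases, never trained

def features(x):
    """Row i holds tanh(w_j * x_i + b_j) for every feature j."""
    return np.tanh(np.outer(x, w) + b)

def neg_second_derivative(x):
    """-d^2/dx^2 of each feature: 2 w^2 t (1 - t^2) with t = tanh(w x + b)."""
    t = features(x)
    return 2.0 * w**2 * t * (1.0 - t**2)

# Stack PDE residual rows and the two boundary rows into one linear system.
x = np.linspace(0.0, 1.0, n)
A = np.vstack([neg_second_derivative(x), features(np.array([0.0, 1.0]))])
y = np.concatenate([np.pi**2 * np.sin(np.pi * x), np.zeros(2)])
N = A.shape[0]

def loss(a):
    """Quadratic PINN loss in the trainable outer coefficients a."""
    r = A @ a - y
    return 0.5 * np.mean(r**2)

H = A.T @ A / N                      # Hessian of the loss
g = A.T @ y / N                      # so that grad L(a) = H a - g

def implicit_step(a, eta):
    """Solve a_next = a - eta * (H a_next - g) for a_next."""
    return np.linalg.solve(np.eye(m) + eta * H, a + eta * g)

a = np.zeros(m)
losses = [loss(a)]
for _ in range(200):
    a = implicit_step(a, eta=10.0)
    losses.append(loss(a))

print(f"initial loss {losses[0]:.3e}, final loss {losses[-1]:.3e}")
print(f"coefficient norm {np.linalg.norm(a):.3e}")
```

In this convex quadratic case each implicit step is a proximal-point step, so the loss cannot increase for any positive step size. Printing the coefficient norm is only a way to watch whether the parameters stay bounded in this toy problem; it does not reproduce the paper's implicit regularization result.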
