
ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning

Nilo Schwencke, Cyril Furtlehner

TL;DR

ANaGRAM addresses training efficiency and accuracy limitations of Physics-Informed Neural Networks (PINNs) for solving PDEs by introducing an empirical natural-gradient method with complexity ${\mathcal{O}}(\min(P^2S, S^2P))$ and a functional-analytic reformulation that extends natural gradient to PINNs and links to Green's functions. The authors prove that the PINN natural-gradient update corresponds to a generalized Green's function on the tangent space and demonstrate robust performance across multiple PDE benchmarks (2D and 5D Laplace, Heat, Allen-Cahn), often outperforming standard NG approaches with competitive compute times. The work provides a geometric, scalable pathway for geometry-aware optimization in PINNs and suggests practical avenues for data assimilation and high-dimensional PDE solvers. Overall, ANaGRAM offers both theoretical insight and practical improvements for training PINNs on challenging PDE problems.

Abstract

In recent years, Physics-Informed Neural Networks (PINNs) have received strong interest as a method to solve PDE-driven systems, in particular for data assimilation purposes. This method is still in its infancy, with many shortcomings and failures that are not yet properly understood. In this paper we propose a natural gradient approach to PINNs which speeds up training and improves its accuracy. Based on an in-depth analysis of the differential geometric structures of the problem, we come up with two distinct contributions: (i) a new natural gradient algorithm that scales as $\min(P^2S, S^2P)$, where $P$ is the number of parameters and $S$ the batch size; (ii) a mathematically principled reformulation of the PINNs problem that allows the extension of natural gradient to it, with proved connections to Green's function theory.
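To give a feel for where a cost of $\min(P^2S, S^2P)$ can come from, here is a minimal sketch, not the authors' implementation. It assumes the update direction can be obtained from a pseudo-inverse of an $S \times P$ Jacobian (model outputs at $S$ sample points differentiated with respect to $P$ parameters), computed through a thin SVD, whose cost for an $S \times P$ matrix is of that order. The function name `pinv_step`, the cutoff `rcond`, and the toy linear model are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pinv_step(jac, residual, rcond=1e-6):
    """Return d = pinv(jac) @ residual using a thin SVD (illustrative helper)."""
    # Thin SVD: u is (S, K), s is (K,), vt is (K, P), with K = min(S, P).
    u, s, vt = np.linalg.svd(jac, full_matrices=False)
    # Discard singular values below a relative cutoff to stabilize the inverse.
    keep = s > rcond * s[0]
    return vt[keep].T @ ((u[:, keep].T @ residual) / s[keep])

# Toy linear model u(x; theta) = theta[0] + theta[1] * x, target f(x) = 2 + 3x.
x = np.linspace(0.0, 1.0, 8)                   # S = 8 sample points
jac = np.stack([np.ones_like(x), x], axis=1)   # (S, P) Jacobian with P = 2
theta = np.zeros(2)
residual = jac @ theta - (2.0 + 3.0 * x)       # model output minus target
theta = theta - pinv_step(jac, residual)       # one step with learning rate 1
```

Because the toy model is linear in its parameters, a single full step lands on the least-squares solution. For a nonlinear network the Jacobian changes at every iterate, so the step would be repeated, typically with a learning rate or line search.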

Paper Structure

This paper contains 52 sections, 13 theorems, 130 equations, 22 figures, 17 tables, 2 algorithms.

Key Result

Theorem 1

Let us define $\textrm{for all } 1\leq i\leq S$ and $\textrm{for all } 1\leq p\leq P$: Then: where $E_{\bm{\theta}}^\textrm{metric}$ and $E^\bot_{\bm{\theta}}$ are correction terms specified in eqn:E_metric_correction and eqn:E_bot_correction in subsec:consequences of NNTK-theory, respectively accounting for the metric's impact on the empirical tangent space definition, and the subtraction of the ev

Figures (22)

  • Figure 1: Median absolute $L^2$ errors and test losses for the 2D Laplace equation.
  • Figure 2: Median absolute $L^2$ errors and test losses for the Heat equation.
  • Figure 3: Median absolute $L^2$ errors and test losses for the 5D Laplace equation.
  • Figure 4: Median absolute $L^2$ errors and test losses for the Allen-Cahn equation.
  • Figure 5: Median absolute $L^2$ errors and test losses for the 2D Laplace equation across 10 different initializations for the five optimizers, relative to computation time. The shaded area indicates the range between the first and third quartiles.
  • ...and 17 more figures

Theorems & Definitions (42)

  • Definition 1: Parametric model
  • Definition 2: Differential of a parametric model
  • Theorem 1: ANaGRAM
  • Remark 1
  • Proposition 1
  • Theorem 2
  • Remark 2
  • Definition 3
  • Theorem 3
  • Corollary 1
  • ...and 32 more