TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision

Zhuo Chen, Jacob McCarran, Esteban Vizcaino, Marin Soljačić, Di Luo

TL;DR

The paper addresses accurate neural PDE solving, particularly for initial-value problems, by introducing Time-Evolving Natural Gradient (TENG), which unifies time-dependent variational principles and optimization-based time integration through repeated $u$-space optimizations with tangent-space projections. By deriving efficient algorithms (TENG-Euler and high-order variants like TENG-Heun) and leveraging sparse updates, TENG achieves machine-precision per-step optimization across PDEs such as the heat, Allen-Cahn, and Burgers equations, outperforming TDVP, OBTI, and PINN baselines in accuracy with competitive runtimes. The work also provides a complexity and error analysis, including a reparameterization-invariance result and connections to Gauss-Newton, and demonstrates substantial empirical gains on multi-dimensional benchmarks. Overall, TENG offers a practical, high-precision framework for neural PDE solvers with potential for broad scientific impact and extension to more complex, real-world problems.

Abstract

Partial differential equations (PDEs) are instrumental for modeling dynamical systems in science and engineering. The advent of neural networks has initiated a significant shift in tackling these complexities, though challenges in accuracy persist, especially for initial value problems. In this paper, we introduce the $\textit{Time-Evolving Natural Gradient (TENG)}$, generalizing time-dependent variational principles and optimization-based time integration, leveraging natural gradient optimization to obtain high accuracy in neural-network-based PDE solutions. Our comprehensive development includes algorithms like TENG-Euler and its high-order variants, such as TENG-Heun, tailored for enhanced precision and efficiency. TENG's effectiveness is further validated through its performance, surpassing current leading methods and achieving $\textit{machine precision}$ in step-by-step optimizations across a spectrum of PDEs, including the heat equation, Allen-Cahn equation, and Burgers' equation.
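
For reference, the two time-integration schemes named above can be written in function space as follows. This is a sketch under the assumption that TENG-Euler and TENG-Heun follow the standard explicit Euler and Heun (predictor-corrector) updates, with each target then fitted by an optimization over the network parameters $\theta$. Here $\hat{u}_{\theta_t}$ denotes the network solution at time $t$ and $\mathcal{L}$ the right-hand-side operator of the PDE $\partial_t u = \mathcal{L}u$; the symbols $u_{\text{target}}$ and $u_{\text{temp}}$ are illustrative and not necessarily the paper's notation.

$$
\begin{aligned}
\text{Euler:}\quad & u_{\text{target}} = \hat{u}_{\theta_t} + \Delta t\, \mathcal{L}\hat{u}_{\theta_t}, \\
\text{Heun:}\quad & u_{\text{temp}} = \hat{u}_{\theta_t} + \Delta t\, \mathcal{L}\hat{u}_{\theta_t}, \qquad
u_{\text{target}} = \hat{u}_{\theta_t} + \frac{\Delta t}{2}\left(\mathcal{L}\hat{u}_{\theta_t} + \mathcal{L}u_{\text{temp}}\right).
\end{aligned}
$$

In both cases the parameters for the next time step would be obtained by minimizing a distance between $\hat{u}_{\theta}$ and $u_{\text{target}}$.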

Paper Structure (18 sections, 4 theorems, 35 equations, 13 figures, 3 tables, 3 algorithms)

Key Result

Theorem 4.1

The optimal solution of TENG is reparameterization invariant even with nonzero $\Delta t$.

Figures (13)

  • Figure 1: TENG generalizes the existing TDVP and OBTI methods. Within a single time step, TDVP projects the update direction $\mathcal{L}\hat{u}_{\theta_t}$ onto the tangent space of the neural network manifold $T_{\hat{u}_{\theta_t}}\mathcal{M}_\Theta$ at time $t$, and evolves the parameters $\theta$ according to this tangent space projection. OBTI optimizes $\theta$ to obtain an approximation to the target function $\hat{u}_{\theta_t} + \Delta t \mathcal{L} \hat{u}_{\theta_t}$ on the manifold $\mathcal{M}_\Theta$. Generalizing these two methods, TENG defines the loss function directly in the $u$-space and optimizes the loss function via repeated projections to the tangent space $T_{\hat{u}_{\theta_t}}\mathcal{M}_\Theta$.
  • Figure 2: Benchmark of TENG, in terms of relative $L^2$-error as a function of time, against various algorithms on two- and three-dimensional heat equations, the Allen-Cahn equation, and Burgers' equation. All sequential-in-time methods use the same time step size $\Delta t = 0.005$ for the heat and Allen-Cahn equations and $\Delta t = 0.001$ for Burgers' equation.
  • Figure 3: Reference solution, TENG solution, and the difference between them for Burgers' equation. The reference solution is generated using the spectral method, and the TENG solution shown here uses the TENG-Heun method with $\Delta t = 0.001$.
  • Figure 4: Training loss during the time step at $T=1$ and final training losses for all time steps for the TENG-Euler method and the two OBTI methods for the Allen-Cahn equation.
  • Figure 5: Comparison of different time integration schemes of TENG with respect to the time step sizes on the Allen-Cahn equation and Burgers' equation, using global relative $L^2$-error as a metric.
  • ...and 8 more figures
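
The Figure 1 caption describes TENG as minimizing a loss defined in $u$-space through repeated projections onto the tangent space. The NumPy sketch below illustrates that idea for a single Euler step of the 1D heat equation. It is a toy under stated assumptions, not the authors' implementation: the "network" is a small sinusoidal model with hand-written derivatives, the loss is a squared $L^2$ distance on a grid, the projection is an ordinary least-squares solve, and all names (`model`, `jacobian`, `heat_rhs`, `teng_euler_step`) are hypothetical.

```python
import numpy as np

# Toy stand-in for a neural network (an assumption for illustration):
# u_theta(x) = sum_j exp(s_j) * sin(j * x + phi_j), with theta = (s, phi).
FREQS = np.arange(1, 4)  # fixed integer frequencies j = 1, 2, 3


def model(theta, x):
    s, phi = np.split(theta, 2)
    return (np.exp(s) * np.sin(np.outer(x, FREQS) + phi)).sum(axis=1)


def jacobian(theta, x):
    """d u_theta(x_i) / d theta_k; columns span the tangent space at theta."""
    s, phi = np.split(theta, 2)
    phase = np.outer(x, FREQS) + phi
    d_s = np.exp(s) * np.sin(phase)    # derivative w.r.t. s_j
    d_phi = np.exp(s) * np.cos(phase)  # derivative w.r.t. phi_j
    return np.hstack([d_s, d_phi])


def heat_rhs(theta, x, nu):
    """L u = nu * u_xx, evaluated analytically for the toy model."""
    s, phi = np.split(theta, 2)
    phase = np.outer(x, FREQS) + phi
    return (-nu * FREQS**2 * np.exp(s) * np.sin(phase)).sum(axis=1)


def teng_euler_step(theta, x, dt, nu, n_iters=5):
    # Euler target in u-space; it stays fixed during the inner optimization.
    u_target = model(theta, x) + dt * heat_rhs(theta, x, nu)
    for _ in range(n_iters):
        # Descent direction of the squared L2 loss, expressed in u-space.
        residual = u_target - model(theta, x)
        # Project that direction onto the tangent space via least squares.
        delta, *_ = np.linalg.lstsq(jacobian(theta, x), residual, rcond=None)
        theta = theta + delta
    return theta, np.linalg.norm(u_target - model(theta, x))


x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
theta0 = np.concatenate([np.log([1.0, 0.5, 0.25]), [0.3, -0.2, 0.1]])
theta1, res = teng_euler_step(theta0, x, dt=0.005, nu=0.1)
print("residual after inner iterations:", res)
```

Because the amplitudes enter through `exp(s)`, a single projection does not land exactly on the target, so the loop repeats the projection a few times. For this toy model the Euler target lies on the model manifold, which is why the residual can be driven close to floating-point round-off; a real network would generally retain some approximation error.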

Theorems & Definitions (9)

  • proof
  • Theorem 4.1
  • proof
  • Theorem 4.2
  • proof
  • Theorem 1.1
  • proof
  • Theorem 1.2
  • proof