Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks

Felix Dangel; Johannes Müller; Marius Zeinhofer

TL;DR

Empirically, the proposed Kronecker-factored approximate curvature (KFAC) based optimizers are competitive with expensive second-order methods on small problems, scale more favorably to higher-dimensional neural networks and PDEs, and consistently outperform first-order methods and LBFGS.

Abstract

Physics-informed neural networks (PINNs) are infamous for being hard to train. Recently, second-order methods based on natural gradient and Gauss-Newton methods have shown promising performance, improving the accuracy achieved by first-order methods by several orders of magnitude. However, these methods scale only to networks with a few thousand parameters because of the high computational cost of evaluating, storing, and inverting the curvature matrix. We propose Kronecker-factored approximate curvature (KFAC) for PINN losses, which greatly reduces the computational cost and allows scaling to much larger networks. Our approach goes beyond the established KFAC for traditional deep learning problems, as it captures contributions from a PDE's differential operator that are crucial for optimization. To establish KFAC for such losses, we use Taylor-mode automatic differentiation to describe the differential operator's computation graph as a forward network with shared weights. This allows us to apply KFAC thanks to a recently developed general formulation for networks with weight sharing. Empirically, we find that our KFAC-based optimizers are competitive with expensive second-order methods on small problems, scale more favorably to higher-dimensional neural networks and PDEs, and consistently outperform first-order methods and LBFGS.
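To make the cost argument in the abstract concrete, the following is a minimal sketch of why a Kronecker-factored curvature is cheap to invert. It covers only the generic single-layer KFAC idea, not the PINN-specific construction: the accumulation of factors over the derivative terms propagated by Taylor-mode differentiation is not shown. The function names, the damping scheme, and the random stand-in data are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def kron_precondition(grad, A, B, damping=1e-3):
    """Solve ((A + damping*I) kron (B + damping*I)) vec(X) = vec(grad) for X.

    grad: (d_out, d_in) gradient of one linear layer's weight matrix.
    A:    (d_in, d_in) symmetric input-side Kronecker factor.
    B:    (d_out, d_out) symmetric output-side Kronecker factor.
    The Kronecker product is never formed: with column-stacking vec,
    (A kron B) vec(X) = vec(B X A^T), so two small solves suffice.
    """
    A_d = A + damping * np.eye(A.shape[0])
    B_d = B + damping * np.eye(B.shape[0])
    X = np.linalg.solve(B_d, grad)       # B_d^{-1} grad
    X = np.linalg.solve(A_d, X.T).T      # (B_d^{-1} grad) A_d^{-1}, A_d symmetric
    return X


def dense_reference(grad, A, B, damping=1e-3):
    """Same solve via the explicit Kronecker matrix (only viable for tiny layers)."""
    A_d = A + damping * np.eye(A.shape[0])
    B_d = B + damping * np.eye(B.shape[0])
    K = np.kron(A_d, B_d)                # shape (d_in*d_out, d_in*d_out)
    x = np.linalg.solve(K, grad.flatten(order="F"))
    return x.reshape(grad.shape, order="F")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d_in, d_out = 32, 4, 3
    Z = rng.normal(size=(n, d_in))       # hypothetical layer inputs
    g = rng.normal(size=(n, d_out))      # hypothetical output gradients
    A = Z.T @ Z / n                      # input-side factor
    B = g.T @ g / n                      # output-side factor
    grad = g.T @ Z / n                   # weight gradient, shape (d_out, d_in)
    fast = kron_precondition(grad, A, B)
    slow = dense_reference(grad, A, B)
    print("max abs difference:", np.abs(fast - slow).max())
```

The factored solve works with a `d_in x d_in` and a `d_out x d_out` matrix, whereas the dense reference builds a matrix with `(d_in * d_out)^2` entries; this gap is what limits full second-order methods to small networks.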

Paper Structure (83 sections, 71 equations, 19 figures, 2 tables, 1 algorithm)

Introduction
Related work
Background
Flattening & Derivatives
Sequential neural nets
Energy Natural Gradients for Physics-Informed Neural Networks
Kronecker-factored Approximate Curvature
Kronecker-Factored Approximate Curvature for PINNs
Higher-order Forward Mode Automatic Differentiation as Weight Sharing
Forward Laplacian
KFAC for Gauss-Newton Matrices with the Laplace Operator
KFAC for Generalized Gauss-Newton Matrices Involving General PDE Terms
Algorithmic Details
Exponential moving average and damping
Gradient preconditioning
...and 68 more sections

Figures (19)

Figure 1: Performance of different optimizers on the 2d Poisson equation \ref{eq:2D-Poisson} measured in relative $L_2$ error against wall clock time for architectures with different parameter dimensions $D$.
Figure 2: Performance of different optimizers on the (4+1)d heat equation \ref{eq:4D-heat} measured in relative $L_2$ error against wall clock time for architectures with different parameter dimensions $D$.
Figure 3: Optimizer performance on Poisson equations in high dimensions and different boundary conditions measured in relative $L_2$ error against wall clock time for networks with $D$ parameters.
Figure 4: Performance of different optimizers on a (9+1)d logarithmic Fokker-Planck equation in relative $L_2$ error against wall clock time.
Figure A5: Training loss and evaluation $L_2$ error for learning the solution to a 2d Poisson equation over (\ref{subfig:poisson2d-time}) time and (\ref{subfig:poisson2d-step}) steps. Columns are different neural networks.
...and 14 more figures
