Enhancing training of physics-informed neural networks using domain-decomposition based preconditioning strategies

Alena Kopaničáková; Hardik Kothari; George Em Karniadakis; Rolf Krause

TL;DR

The paper tackles the slow and often ill-conditioned training of physics-informed neural networks (PINNs) by introducing nonlinear right preconditioning based on Schwarz domain decomposition. The network parameters are decomposed layer-wise into subnetworks, and a local problem is solved on each subnetwork. From these local solves the authors build additive (ASPQN) and multiplicative (MSPQN) preconditioners $G$, which yield the preconditioned system $\mathcal{F}(\boldsymbol{\theta}) = \nabla \mathcal{L}(G(\boldsymbol{\theta}))$ and improved global updates for the loss $\mathcal{L}$. Empirical results on Burgers', diffusion-advection, Klein-Gordon, and Allen-Cahn problems show that the SPQN methods significantly accelerate convergence and deliver more accurate PDE solutions than standard Adam and L-BFGS, with ASPQN offering substantial model-parallel speedups. The work provides a scalable framework for PINN training that leverages parallel subnetworks and nonlinear preconditioning, with potential applicability to deep-learning tasks beyond PINNs.
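The layer-wise decomposition mentioned above can be illustrated with a minimal sketch. It assumes the layers are split into contiguous, non-overlapping groups of near-equal size; the paper's actual grouping rule may differ, and the helper name `partition_layers` and the 8-layer network are hypothetical.

```python
def partition_layers(num_layers, num_subdomains):
    """Split layer indices 0..num_layers-1 into contiguous, non-overlapping groups.

    Assumption: groups are as equal in size as possible; this is an
    illustrative rule, not necessarily the one used in the paper.
    """
    if not 1 <= num_subdomains <= num_layers:
        raise ValueError("need 1 <= num_subdomains <= num_layers")
    base, extra = divmod(num_layers, num_subdomains)
    groups, start = [], 0
    for s in range(num_subdomains):
        # The first `extra` groups receive one additional layer.
        size = base + (1 if s < extra else 0)
        groups.append(list(range(start, start + size)))
        start += size
    return groups


# Hypothetical 8-layer network split into 4 subnetworks.
subdomains = partition_layers(8, 4)
```

Each group plays the role of one subnetwork: during a local solve only the parameters of the layers in that group are updated, while all other layers stay frozen.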

Abstract

We propose to enhance the training of physics-informed neural networks (PINNs). To this end, we introduce nonlinear additive and multiplicative preconditioning strategies for the widely used L-BFGS optimizer. The nonlinear preconditioners are constructed using the Schwarz domain-decomposition framework, where the parameters of the network are decomposed in a layer-wise manner. Through a series of numerical experiments, we demonstrate that both the additive and the multiplicative preconditioners significantly improve the convergence of the standard L-BFGS optimizer, while providing more accurate solutions of the underlying partial differential equations. Moreover, the additive preconditioner is inherently parallel, thus giving rise to a novel approach to model parallelism.
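The difference between the additive and the multiplicative strategy can be sketched on a toy problem. The sketch below is not the authors' implementation. It assumes a small hand-written loss in place of the PINN loss, finite-difference gradients, plain gradient descent with step halving in place of L-BFGS for both local and global steps, and a backtracked damping factor for the additive update. All function names are hypothetical.

```python
def loss(theta):
    # Toy coupled objective standing in for the PINN loss (hypothetical).
    fit = sum((t - 1.0) ** 2 for t in theta)
    coupling = sum((theta[i + 1] - theta[i] ** 2) ** 2
                   for i in range(len(theta) - 1))
    return fit + coupling


def grad(f, theta, idx, h=1e-6):
    """Central-difference gradient of f w.r.t. the entries listed in idx."""
    g = []
    for i in idx:
        up, dn = list(theta), list(theta)
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g


def descent_step(f, theta, idx, lr=1.0, max_halvings=40):
    """One gradient step on the entries in idx, halving lr until f decreases."""
    g, f0 = grad(f, theta, idx), f(theta)
    for _ in range(max_halvings):
        trial = list(theta)
        for i, gi in zip(idx, g):
            trial[i] -= lr * gi
        if f(trial) < f0:
            return trial
        lr *= 0.5
    return list(theta)  # no decrease found: keep the current point


def local_solve(f, theta, idx, k_s):
    """k_s descent steps on one subdomain; all other parameters stay frozen."""
    for _ in range(k_s):
        theta = descent_step(f, theta, idx)
    return theta


def additive_preconditioner(f, theta, blocks, k_s):
    """G(theta): sum of independent local corrections, damped by backtracking."""
    corrections = [0.0] * len(theta)
    for idx in blocks:  # each solve starts from the same theta: parallelizable
        local = local_solve(f, theta, idx, k_s)
        for i in idx:
            corrections[i] = local[i] - theta[i]
    alpha, f0 = 1.0, f(theta)
    for _ in range(40):
        trial = [t + alpha * c for t, c in zip(theta, corrections)]
        if f(trial) < f0:
            return trial
        alpha *= 0.5
    return list(theta)


def multiplicative_preconditioner(f, theta, blocks, k_s):
    """G(theta): local solves applied sequentially, each on the updated point."""
    for idx in blocks:
        theta = local_solve(f, theta, idx, k_s)
    return theta


def train(f, theta, blocks, precond, epochs=20, k_s=10):
    all_idx = list(range(len(theta)))
    for _ in range(epochs):
        theta = precond(f, theta, blocks, k_s)   # theta -> G(theta)
        theta = descent_step(f, theta, all_idx)  # global step taken from G(theta)
    return theta


theta0 = [-1.0, 0.5, 2.0, 0.0, 1.5, -0.5]
blocks = [[0, 1], [2, 3], [4, 5]]  # three hypothetical subdomains
theta_add = train(loss, theta0, blocks, additive_preconditioner)
theta_mul = train(loss, theta0, blocks, multiplicative_preconditioner)
```

In the additive variant every local solve starts from the same point, so the solves are independent and could run on separate devices, which is the source of the model parallelism noted in the abstract. The multiplicative variant passes each updated point on to the next subdomain and is therefore sequential.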

Paper Structure (19 sections, 22 equations, 7 figures, 3 tables, 1 algorithm)

Introduction
Physics-informed neural networks
DNN approximation of solution
Training data and loss functional
Nonlinearly preconditioned training using layer-wise decomposition of network
Network decomposition
Right-preconditioned training optimizers
Right-preconditioned L-BFGS
Numerical experiments
Implementation, computational cost, and memory requirements
Implementation of PINNs
Implementation of optimizers
Configuration of SPQN optimizers
Configuration of state-of-the-art optimizers
Computational cost and memory requirements
...and 4 more sections

Figures (7)

Figure 1: An example of the layer-wise decomposition of the network.
Figure 1: A sketch of local-to-global updates utilized by the ASPQN.
Figure 1: Burgers' equation: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{ 10, 50, 100\}$) and varying numbers of subdomains ($N_{sd}\in \{ 2, 4, 8 \}$). The average is taken over $10$ independent runs.
Figure 2: Klein-Gordon: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{ 10, 50, 100\}$) and varying numbers of subdomains ($N_{sd}\in \{ 2, 3, 6 \}$). The average is taken over $10$ independent runs.
Figure 3: Advection-diffusion problem: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{ 10, 50, 100\}$) and varying numbers of subdomains ($N_{sd}\in \{ 2, 5, 10 \}$). The average is taken over $10$ independent runs.
...and 2 more figures

Theorems & Definitions (1)

Remark 5.1
