Enhancing training of physics-informed neural networks using domain-decomposition based preconditioning strategies

Alena Kopaničáková, Hardik Kothari, George Em Karniadakis, Rolf Krause

TL;DR

The paper tackles the slow and often ill-conditioned training of physics-informed neural networks (PINNs) by introducing nonlinear right preconditioning based on Schwarz domain decomposition. By decomposing the network parameters into layer-wise subnetworks and solving local problems, the authors build additive (ASPQN) and multiplicative (MSPQN) preconditioners that yield a preconditioned system $\mathcal{F}(\boldsymbol{\theta}) = \nabla \mathcal{L}(G(\boldsymbol{\theta}))$ and improved global updates for the loss $\mathcal{L}$. Empirical results on Burgers', diffusion-advection, Klein-Gordon, and Allen-Cahn problems show that SPQN methods significantly accelerate convergence and deliver more accurate PDE solutions than standard Adam and L-BFGS, with ASPQN offering substantial model-parallel speedups. The work provides a scalable framework for PINN training that leverages parallel subnetworks and nonlinear preconditioning, with potential applicability to other deep-learning tasks beyond PINNs.

Abstract

We propose to enhance the training of physics-informed neural networks (PINNs). To this end, we introduce nonlinear additive and multiplicative preconditioning strategies for the widely used L-BFGS optimizer. The nonlinear preconditioners are constructed using the Schwarz domain-decomposition framework, where the parameters of the network are decomposed in a layer-wise manner. Through a series of numerical experiments, we demonstrate that both the additive and the multiplicative preconditioners significantly improve the convergence of the standard L-BFGS optimizer, while providing more accurate solutions of the underlying partial differential equations. Moreover, the additive preconditioner is inherently parallel, thus giving rise to a novel approach to model parallelism.
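
To make the additive/multiplicative distinction concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that are not in the paper: the PINN loss is replaced by a small convex quadratic, contiguous parameter blocks stand in for layer-wise subnetworks, local problems are solved with a few gradient steps, and the global L-BFGS step is replaced by plain gradient descent. All function names (`make_problem`, `local_solve`, `additive_step`, `multiplicative_step`, `train`) are hypothetical.

```python
import numpy as np

def make_problem(seed=0, n=6):
    """Toy stand-in for a training loss: L(theta) = 0.5 theta^T A theta - b^T theta."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)          # symmetric positive definite
    b = rng.standard_normal(n)
    # Three contiguous blocks play the role of layer-wise subnetworks (assumption).
    blocks = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
    return A, b, blocks

def loss(A, b, theta):
    return 0.5 * theta @ A @ theta - b @ theta

def grad(A, b, theta):
    return A @ theta - b

def local_solve(A, b, theta, idx, k_s=10):
    """Run k_s gradient steps on block idx only; all other parameters stay frozen."""
    local = theta.copy()
    step = 1.0 / np.linalg.eigvalsh(A[np.ix_(idx, idx)]).max()
    for _ in range(k_s):
        local[idx] -= step * grad(A, b, local)[idx]
    return local[idx] - theta[idx]       # correction for this block

def additive_step(A, b, theta, blocks, k_s=10):
    """All local solves start from the same iterate, so they could run in parallel."""
    corr = np.zeros_like(theta)
    for idx in blocks:
        corr[idx] = local_solve(A, b, theta, idx, k_s)
    # Damping by the number of blocks is a simple choice for this sketch.
    return theta + corr / len(blocks)

def multiplicative_step(A, b, theta, blocks, k_s=10):
    """Blocks are processed one after another, each seeing the previous updates."""
    out = theta.copy()
    for idx in blocks:
        out[idx] += local_solve(A, b, out, idx, k_s)
    return out

def train(A, b, theta0, blocks, precond, n_iter=20):
    """Alternate a preconditioning step with a global gradient step (L-BFGS stand-in)."""
    step = 1.0 / np.linalg.eigvalsh(A).max()
    theta, history = theta0.copy(), [loss(A, b, theta0)]
    for _ in range(n_iter):
        theta = precond(A, b, theta, blocks)          # local-to-global update
        theta = theta - step * grad(A, b, theta)      # global update
        history.append(loss(A, b, theta))
    return theta, history
```

Calling `train(A, b, np.zeros(6), blocks, additive_step)` and the same with `multiplicative_step` gives two loss histories that can be compared. In this sketch the only structural difference is whether the block corrections are computed from a shared iterate and summed (additive) or applied sequentially (multiplicative), which is why only the additive variant parallelizes across blocks.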

Paper Structure (19 sections, 22 equations, 7 figures, 3 tables, 1 algorithm)

This paper contains 19 sections, 22 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: An example of the layer-wise decomposition of the network.
  • Figure 1: A sketch of local-to-global updates utilized by the ASPQN.
  • Figure 1: Burgers' equation: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{10, 50, 100\}$) and varying numbers of subdomains ($N_{sd} \in \{2, 4, 8\}$). The average is taken over $10$ independent runs.
  • Figure 2: Klein-Gordon: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{10, 50, 100\}$) and varying numbers of subdomains ($N_{sd} \in \{2, 3, 6\}$). The average is taken over $10$ independent runs.
  • Figure 3: Advection-diffusion problem: The mean computational cost of the ASPQN method (left column) and the MSPQN method (right column). The results are obtained for varying numbers of local iterations ($k_s \in \{10, 50, 100\}$) and varying numbers of subdomains ($N_{sd} \in \{2, 5, 10\}$). The average is taken over $10$ independent runs.
  • ...and 2 more figures

Theorems & Definitions (1)

  • Remark 5.1