Do physics-informed neural networks (PINNs) need to be deep? Shallow PINNs using the Levenberg-Marquardt algorithm

Muhammad Luthfi Shahab, Imam Mukhlash, Hadi Susanto

TL;DR

This paper demonstrates that shallow physics-informed neural networks (PINNs) can achieve high-accuracy solutions to forward and inverse nonlinear PDEs when trained as nonlinear least-squares problems using the Levenberg–Marquardt (LM) algorithm. It derives exact analytical expressions for neural-network derivatives with respect to inputs and network parameters, enabling precise Jacobian construction and efficient LM updates. Across Burgers, nonlinear Schrödinger, Allen–Cahn, and Bratu problems, LM outperforms traditional BFGS in final loss and solution accuracy, often with orders-of-magnitude improvements while using only two hidden layers. The findings suggest that network depth is not strictly necessary for PINN accuracy when paired with robust second-order optimization and that shallow PINNs with LM offer a practical, computationally efficient route for forward/inverse PDE problems, even on CPU hardware.

Abstract

This work investigates the use of shallow physics-informed neural networks (PINNs) for solving forward and inverse problems of nonlinear partial differential equations (PDEs). By reformulating PINNs as nonlinear systems, the Levenberg-Marquardt (LM) algorithm is employed to efficiently optimize the network parameters. Analytical expressions for the neural network derivatives with respect to the input variables are derived, enabling accurate and efficient computation of the Jacobian matrix required by LM. The proposed approach is tested on several benchmark problems, including the Burgers, Schrödinger, Allen-Cahn, and three-dimensional Bratu equations. Numerical results demonstrate that LM significantly outperforms BFGS in terms of convergence speed, accuracy, and final loss values, even when using shallow network architectures with only two hidden layers. These findings indicate that, for a wide class of PDEs, shallow PINNs combined with efficient second-order optimization methods can provide accurate and computationally efficient solutions for both forward and inverse problems.
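The idea of training a shallow PINN as a nonlinear least-squares problem can be illustrated with a minimal sketch. The example below is not the authors' implementation. It makes several simplifying assumptions: one hidden tanh layer instead of two, a toy problem $u'(x) = -u(x)$, $u(0) = 1$ on $[0, 1]$ rather than any benchmark from the paper, and a basic damping schedule (divide or multiply $\lambda$ by 3). All names (`H`, `residual_and_jacobian`, `levenberg_marquardt`) are hypothetical. The network derivative with respect to the input and the Jacobian with respect to the parameters are written out by hand, and each LM step solves $(J^\top J + \lambda I)\,\delta = -J^\top r$ through an equivalent augmented least-squares system.

```python
import numpy as np

H = 8  # hidden width (an assumption for this sketch)

def unpack(p):
    # Parameter layout: input weights w, biases b, output weights v, output bias c.
    return p[:H], p[H:2 * H], p[2 * H:3 * H], p[3 * H]

def residual_and_jacobian(p, x):
    """Residuals of u' + u = 0 at points x plus the condition u(0) = 1,
    with the analytical Jacobian with respect to all parameters."""
    w, b, v, c = unpack(p)
    X = x[:, None]
    t = np.tanh(X * w + b)           # hidden activations, shape (N, H)
    s = 1.0 - t**2                   # derivative of tanh
    # u(x) = t @ v + c and u'(x) = s @ (v * w), both in closed form.
    r_pde = s @ (v * w) + t @ v + c
    J_pde = np.hstack([
        v * s * (1.0 + X - 2.0 * w * t * X),  # d r / d w
        v * s * (1.0 - 2.0 * w * t),          # d r / d b
        t + w * s,                            # d r / d v
        np.ones((x.size, 1)),                 # d r / d c
    ])
    # Boundary residual u(0) - 1; at x = 0 it does not depend on w.
    t0 = np.tanh(b)
    s0 = 1.0 - t0**2
    r_bc = t0 @ v + c - 1.0
    J_bc = np.concatenate([np.zeros(H), v * s0, t0, [1.0]])
    return np.append(r_pde, r_bc), np.vstack([J_pde, J_bc])

def levenberg_marquardt(p, x, iters=200, lam=1e-2):
    r, J = residual_and_jacobian(p, x)
    loss = 0.5 * r @ r
    history = [loss]
    n = p.size
    for _ in range(iters):
        # Solve min ||J d + r||^2 + lam ||d||^2, which is equivalent to
        # (J^T J + lam I) d = -J^T r but better conditioned numerically.
        A = np.vstack([J, np.sqrt(lam) * np.eye(n)])
        rhs = np.concatenate([-r, np.zeros(n)])
        step = np.linalg.lstsq(A, rhs, rcond=None)[0]
        r_new, J_new = residual_and_jacobian(p + step, x)
        loss_new = 0.5 * r_new @ r_new
        if loss_new < loss:          # accept the step and reduce damping
            p, r, J, loss = p + step, r_new, J_new, loss_new
            lam = max(lam / 3.0, 1e-10)
        else:                        # reject the step and increase damping
            lam = min(lam * 3.0, 1e10)
        history.append(loss)
    return p, history

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)        # collocation points
p, history = levenberg_marquardt(rng.normal(size=3 * H + 1), x)

# Compare against the exact solution exp(-x) on a finer grid.
w, b, v, c = unpack(p)
x_test = np.linspace(0.0, 1.0, 200)
u_pred = np.tanh(x_test[:, None] * w + b) @ v + c
max_err = np.max(np.abs(u_pred - np.exp(-x_test)))
initial_loss, final_loss = history[0], history[-1]
print(f"loss: {initial_loss:.3e} -> {final_loss:.3e}, max error: {max_err:.3e}")
```

Because a step is accepted only when it lowers the loss, the recorded loss history never increases. For a real PDE, the same structure would apply with more residual rows (interior, initial, and boundary points) and higher-order input derivatives, which is where closed-form derivative expressions such as those derived in the paper become useful.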

Paper Structure

This paper contains 15 sections, 57 equations, 5 figures, and 9 tables.

Figures (5)

  • Figure 1: Explicit representation of the internal PINN architecture, illustrating how the core network $\tilde{u}$ (yellow) and its spatial and temporal derivatives are analytically constructed from shared weights and biases.
  • Figure 2: (a) Neural network solution of the Burgers equation obtained using LM. (b) Solutions computed by LM and BFGS at selected time slices. (c) Absolute errors corresponding to the LM and BFGS solutions. (d) Training loss histories of LM and BFGS.
  • Figure 3: (a) Neural network solution of the nonlinear Schrödinger equation obtained using LM. (b) Solutions computed by LM and BFGS at selected time slices. (c) Absolute errors corresponding to the LM and BFGS solutions. (d) Training loss histories of LM and BFGS.
  • Figure 4: (a) Neural network solution of the Allen–Cahn equation obtained using LM. (b) Solutions computed by LM and BFGS at selected time slices. (c) Absolute errors corresponding to the LM and BFGS solutions. (d) Training loss histories of LM and BFGS. (e) Evolution of the estimated parameters during training for LM and BFGS.
  • Figure 5: (a) Neural network solution of the three-dimensional Bratu equation obtained using LM. (b) Solutions computed by LM and BFGS at selected time slices. (c) Absolute errors corresponding to the LM and BFGS solutions. (d) Training loss histories of LM and BFGS. (e) Evolution of the estimated parameters during training for LM and BFGS.