Error Analysis and Numerical Algorithm for PDE Approximation with Hidden-Layer Concatenated Physics Informed Neural Networks

Yianxia Qian, Yongchao Zhang, Suchuan Dong

TL;DR

The paper introduces HLConcPINN, a physics-informed neural network framework that uses hidden-layer concatenation and an extended block time marching scheme to solve parabolic and hyperbolic PDEs with provable error control. By enabling arbitrary network depth (≥2) and general smooth activations beyond the first two layers, HLConcPINN preserves representation capacity and theoretical guarantees. The authors prove residual decay and total error bounds that relate the solution error to training and quadrature errors, and validate the theory with extensive numerical experiments on the heat, Burgers', wave, and nonlinear Klein-Gordon equations. The approach significantly improves long-time accuracy and offers a versatile, theory-backed alternative to standard PINN formulations for complex, time-dependent PDEs.

Abstract

We present the hidden-layer concatenated physics informed neural network (HLConcPINN) method, which combines hidden-layer concatenated feed-forward neural networks, a modified block time marching strategy, and a physics informed approach for approximating partial differential equations (PDEs). We analyze the convergence properties and establish the error bounds of this method for two types of PDEs: parabolic (exemplified by the heat and Burgers' equations) and hyperbolic (exemplified by the wave and nonlinear Klein-Gordon equations). We show that its approximation error of the solution can be effectively controlled by the training loss for dynamic simulations with long time horizons. The HLConcPINN method in principle allows an arbitrary number of hidden layers not smaller than two and any of the commonly-used smooth activation functions for the hidden layers beyond the first two, with theoretical guarantees. This generalizes several recent neural-network techniques, which have theoretical guarantees but are confined to two hidden layers in the network architecture and the $\tanh$ activation function. Our theoretical analyses subsequently inform the formulation of appropriate training loss functions for these PDEs, leading to physics informed neural network (PINN) type computational algorithms that differ from the standard PINN formulation. Ample numerical experiments are presented based on the proposed algorithm to validate the effectiveness of this method and confirm aspects of the theoretical analyses.
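The hidden-layer concatenation described above can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the function names (`init_hlconc`, `hlconc_forward`), the random initialization, and the use of $\tanh$ in every hidden layer are assumptions made for illustration. The only structural point it shows is that the output layer acts on the concatenation of all hidden-layer outputs rather than on the last hidden layer alone.

```python
import math
import random

def init_hlconc(layers, seed=0):
    """layers = [n_in, h_1, ..., h_L, n_out]. Hypothetical initializer (assumption)."""
    rng = random.Random(seed)
    hidden = []
    for n_prev, n_next in zip(layers[:-2], layers[1:-1]):
        W = [[rng.uniform(-1, 1) / math.sqrt(n_prev) for _ in range(n_prev)]
             for _ in range(n_next)]
        hidden.append((W, [0.0] * n_next))
    # The output layer sees every hidden node, so its input width is the
    # sum of all hidden-layer widths (not just the last one).
    n_concat = sum(layers[1:-1])
    W_out = [[rng.uniform(-1, 1) / math.sqrt(n_concat) for _ in range(n_concat)]
             for _ in range(layers[-1])]
    return hidden, (W_out, [0.0] * layers[-1])

def affine(W, b, x):
    """Compute W x + b for list-of-lists W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def hlconc_forward(params, x, act=math.tanh):
    """Forward pass: collect each hidden layer's output and feed them all to the output layer."""
    hidden, (W_out, b_out) = params
    h, exposed = list(x), []
    for W, b in hidden:
        h = [act(z) for z in affine(W, b, h)]
        exposed.extend(h)  # expose this hidden layer to the output nodes
    return affine(W_out, b_out, exposed)

# Same layer widths as the [2, 90, 90, 10, 1] architecture quoted in the figure captions.
params = init_hlconc([2, 90, 90, 10, 1])
u = hlconc_forward(params, [0.3, 0.5])  # input is an (x, t) pair
```

With widths [2, 90, 90, 10, 1], the output layer in this sketch has 90 + 90 + 10 = 190 inputs, whereas a conventional feed-forward network of the same widths would have 10.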

Paper Structure

This paper contains 32 sections, 19 theorems, 100 equations, 18 figures, 10 tables.

Key Result

Theorem 3.2

Let $\widetilde{\Omega}_i = D\times [0,t_i]$ and $\widetilde{\Omega}_{*i}= \partial D\times [0,t_i]$. Suppose $n$, $d$, $k \in \mathbb{N}$ with $n\geq2$ and $k\geq 3$, and $u\in H^k(\widetilde{\Omega}_i)$. For every integer $N>5$, there exists a HLConcPINN $u_{\theta_i}$ such that

Figures (18)

  • Figure 1: Illustration of network structures (with 3 hidden layers) for conventional and hidden-layer concatenated neural networks. In hidden-layer concatenated FNN, all the hidden nodes are exposed to the output nodes, while in conventional FNN only the last hidden-layer nodes are exposed to the output nodes.
  • Figure 2: Illustration of the block time marching (BTM) strategy. The large time domain is partitioned into multiple blocks, with each block computed individually and successively. Solution in one block informs the initial condition for the subsequent time block.
  • Figure 3: Heat equation: Distributions of the true solution (a), the HLConcPINN-ExBTM solution (b) and its point-wise absolute error (c), the HLConcPINN-BTM solution (d) (denoted by $u^*_\theta$) and its point-wise absolute error (e), in the spatial-temporal domain. NN architecture: [2, 90, 90, 10, 1], with the $\tanh$ activation function; $N_c=2000$ for the collocation points.
  • Figure 4: Heat equation: Top row, comparison of profiles of the true solution, HLConcPINN-ExBTM solution, and HLConcPINN-BTM solution at several time instants. Bottom row, profiles of the absolute error of the HLConcPINN-ExBTM and HLConcPINN-BTM solutions. NN architecture: [2, 90, 90, 10, 1] with $\tanh$ activation function; $N_c=2000$ for the training collocation points.
  • Figure 5: Heat equation: Training loss versus the training iterations for different time blocks with the (a) HLConcPINN-ExBTM and (b) HLConcPINN-BTM methods. NN architecture: [2, 90, 90, 10, 1], $\tanh$ activation function; $N_c=2000$ for the training collocation points. The legend shows the time block index, with e.g. $T_{\#2}$ denoting the second time block.
  • ...and 13 more figures
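
The block time marching strategy in the Figure 2 caption can be sketched as a simple driver loop. This is a hedged illustration only: `block_time_marching` and `decay_solver` are hypothetical names, and the per-block solver is a stand-in (the exact solution of $du/dt=-u$), not a trained PINN. It covers only the basic scheme from the caption, in which each block's solution at its final time becomes the initial condition of the next block, and does not model the extended (ExBTM) variant.

```python
import math

def block_time_marching(t_end, n_blocks, u0, solve_block):
    """Partition [0, t_end] into equal blocks and solve them one after another."""
    edges = [t_end * i / n_blocks for i in range(n_blocks + 1)]
    u_init, solutions = u0, []
    for t_start, t_stop in zip(edges[:-1], edges[1:]):
        # solve_block returns a callable t -> u(t) valid on [t_start, t_stop]
        u_block = solve_block(t_start, t_stop, u_init)
        solutions.append((t_start, t_stop, u_block))
        # the end state of this block is the initial condition of the next
        u_init = u_block(t_stop)
    return solutions

def decay_solver(t_start, t_stop, u_init):
    """Stand-in for per-block training (assumption): exact solution of du/dt = -u."""
    return lambda t: u_init * math.exp(-(t - t_start))

blocks = block_time_marching(4.0, 4, 1.0, decay_solver)
u_final = blocks[-1][2](4.0)  # solution at the end of the last block
```

In an actual PINN setting, `solve_block` would train a network on the block's collocation points, so any error in one block's final state carries over into the next block's initial condition.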

Theorems & Definitions (35)

  • Definition 2.1
  • Remark 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Theorem 4.1
  • proof
  • Theorem 4.2
  • proof
  • Theorem 4.3
  • ...and 25 more