Invariant deep neural networks under the finite group for solving partial differential equations

Zhi-Yong Zhang; Jie-Ying Li; Lei-Lei Guo

TL;DR

The paper addresses the limited accuracy and poor extrapolation of physics-informed neural networks (PINN) for solving PDEs. It introduces a symmetry-enhanced deep neural network (sDNN) whose architecture is invariant under a finite group $G$: when the group has a matrix representation, the weight and bias dimensions of each hidden layer are expanded by the group order $|G|$; otherwise the input data are extended by $|G|$. Because of the symmetric structure, the number of trainable parameters is only about $1/|G|$ of the original PINN size. The authors prove invariance and a universal derivative approximation property for sDNN, and report PDE experiments (advection, sine-Gordon, Poisson, nonlinear wave, KdV) in which sDNN extrapolates better than the vanilla PINN, preserves the symmetry, and needs fewer training points. The work suggests that embedding finite-group symmetry directly into the network architecture can improve both the reliability and the efficiency of physics-informed PDE solvers.

Abstract

Using physics-informed neural networks (PINN) to solve partial differential equations (PDEs) has become an active topic and has shown great power, but the approach still suffers from limited prediction accuracy inside the sampling domain and poor prediction ability beyond it. These problems are usually mitigated by adding physical properties of the PDEs to the loss function or by using special techniques to change the form of the loss function for particular PDEs. In this paper, we design a symmetry-enhanced deep neural network (sDNN) that makes the architecture of the neural network invariant under a finite group. If the group has a matrix representation, the dimensions of the weight matrices and bias vectors in each hidden layer are expanded by the order of the finite group; otherwise, the set of input data and the hidden layers, except for the first hidden layer, are extended by the order of the finite group. Despite this expansion, the total number of training parameters is only about one over the order of the finite group of the original PINN size, owing to the symmetric architecture of sDNN. Furthermore, we give the special forms of the weight matrices and bias vectors of sDNN, and rigorously prove that the architecture itself is invariant under the finite group and that the sDNN has the universal approximation ability to learn functions that are invariant under the finite group. Numerical results show that the sDNN has strong prediction ability in and beyond the sampling domain and performs far better than the vanilla PINN with fewer training points and a simpler architecture.
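To make the idea of an architecture that is invariant under a finite group concrete, the following is a minimal sketch of the generic group-averaging construction: a plain network is evaluated on every transformed copy of the input and the results are averaged. This is closest in spirit to the variant that extends the set of input data, but it is not claimed to be the authors' exact sDNN construction. The two-element group `GROUP` acting on $(x, t)$, the layer sizes, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def init_params(sizes, seed=0):
    """Random weights and zero biases for a fully connected network."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(size=(m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Plain tanh network; x has shape (batch, d_in)."""
    h = x
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b

def symmetrized(params, x, group):
    """Average the base network over the orbit {g x : g in G}.

    Because G is closed under multiplication, replacing x by g' x only
    reorders the terms of the sum, so the output is invariant.
    """
    return sum(mlp(params, x @ g.T) for g in group) / len(group)

# Illustrative group (an assumption): {I, -I}, i.e. (x, t) -> (-x, -t).
GROUP = [np.eye(2), -np.eye(2)]

params = init_params([2, 16, 16, 1])
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(5, 2))

out = symmetrized(params, x, GROUP)
out_reflected = symmetrized(params, -x, GROUP)
print(np.max(np.abs(out - out_reflected)))
```

The printed difference should be at the level of floating-point round-off. Note that this sketch evaluates the base network $|G|$ times per input, whereas the abstract describes building the symmetry into the weight matrices and bias vectors themselves.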

Paper Structure (15 sections, 11 theorems, 53 equations, 21 figures, 1 algorithm)

Introduction
Problem formulation
PINN for solving PDEs
Poor solution extrapolation ability of PINN
Invariant neural networks under a finite group
Construct the invariant neural networks
Approximation Theory for sDNN
Algorithm of sDNN
Numerical results
Advection equation
sine-Gordon equation
Poisson equation
A nonlinear wave equation
Korteweg-de Vries equation
Conclusion

Key Result

Theorem 3.1

Suppose that Eq. (eqn1) is admitted by a finite group $G:=\{g_0(=e),g_1,g_2,\dots,g_{n-1}\}$ of order $n$. Let $\textbf{x}^{(0)}$ be a set of input data, and let ${\textbf{w}}_l$ and ${\textbf{b}}_l\,(l=1,\dots,L)$ be the initialized weight matrices and bias vectors of the $l$-th hidden layer of the PINN. Then a … where ${\textbf{w}}_1$ is an initialized weight matrix with $n_1$ rows and $2$ columns. 2). In the …

Figures (21)

Figure 1: (Color online) Comparison of the PINN and sDNN for learning function $f(x)$ in (\ref{example}) with different $\alpha$, where $[-1,0]$ is the sampling domain and $[-1,1]$ is the prediction domain.
Figure 2: (Color online) Schematic diagrams of the PINN, sDNN with and without matrix representation: (A) PINN. (B) sDNN without matrix representation. (C) sDNN with matrix representation.
Figure 3: (Color online) Advection equation: Comparisons of $L_2$ relative errors and the symmetry metric of $u$ in the sampling and prediction domains by the two methods. Keeping the number of neurons fixed and varying the number of collocation points: (A) the $L_2$ relative errors; (B) the symmetry metrics. Keeping the number of collocation points fixed and varying the number of neurons: (C) the $L_2$ relative errors; (D) the symmetry metrics. Each line represents the mean of five independent experiments. Note that for sDNN and PINN the subscripts $1$ and $2$ denote the sampling domain and the prediction domain respectively, and the symbol 'WD PINN' means training in the whole domain.
Figure 4: (Color online) Advection equation: Comparisons of absolute errors of PINN and sDNN in the first quadrant and the third quadrant. (A) Absolute errors by sDNN in the first quadrant. (B) Absolute errors by PINN in the first quadrant. (C) Absolute errors by sDNN in the third quadrant. (D) Absolute errors by PINN in the third quadrant.
Figure 5: (Color online) Advection equation: Cross sections of predicted solutions by PINN and sDNN and exact solutions. (A) Schematic diagrams of sampling and prediction domains where the green lines represent the two values of $t$ in cross section B and the yellow lines represent the two values of $x$ in cross section C. (B) Cross sections for $t=\pm0.5$: $t=0.5$ for $x\geq0$; $t=-0.5$ for $x < 0$. (C) Cross sections for $x=\pm1.00$: $x=1.00$ for $t\geq 0$; $x=-1.00$ for $t < 0$. (D) Loss histories for PINN and sDNN against the number of iterations.
...and 16 more figures

Theorems & Definitions (11)

Theorem 3.1
Corollary 3.2
Corollary 3.3
Theorem 3.4: Universal derivative approximation theory [Pin-1999]
Lemma 3.5: Multivariate Faà di Bruno formula [Con-1996]
Theorem 3.6
Proposition 4.1: Even neural network
Proposition 4.2: Circulant symmetry
Proposition 4.3
Proposition 4.4
...and 1 more
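
The statement of Proposition 4.1 (even neural network) is not reproduced on this page, so the following is only a hedged sketch of how an even network can be obtained by tying weights, in the spirit of expanding the weight and bias dimensions by the group order. It assumes the two-element group $\{e, x \mapsto -x\}$ and a single hidden layer; the stacking pattern and the names `even_net`, `tied_param_count` and `untied_param_count` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def even_net(W, b, v, c, x):
    """One-hidden-layer network with a widened, weight-tied hidden layer.

    Hidden weights are stacked as [W, -W], biases as [b, b], and output
    weights as [v; v]. Mapping x -> -x only swaps the two hidden blocks,
    and the tied output weights sum over both, so the output is even in x.
    """
    W_big = np.concatenate([W, -W], axis=1)   # shape (d_in, 2n)
    b_big = np.concatenate([b, b])            # shape (2n,)
    v_big = np.concatenate([v, v], axis=0)    # shape (2n, 1)
    return np.tanh(x @ W_big + b_big) @ v_big + c

def tied_param_count(d_in, n):
    """Trainable parameters of the tied network: W, b, v and the scalar c."""
    return d_in * n + n + n + 1

def untied_param_count(d_in, n):
    """Parameters of an unconstrained network with the same width 2n."""
    return d_in * 2 * n + 2 * n + 2 * n + 1

d_in, n = 1, 8
rng = np.random.default_rng(0)
W = rng.normal(size=(d_in, n))
b = rng.normal(size=n)
v = rng.normal(size=(n, 1))
c = 0.3

x = np.linspace(-1.0, 1.0, 11).reshape(-1, 1)
y_pos = even_net(W, b, v, c, x)
y_neg = even_net(W, b, v, c, -x)

print(np.max(np.abs(y_pos - y_neg)))
print(tied_param_count(d_in, n), untied_param_count(d_in, n))
```

For these sizes the tied network has 25 trainable parameters against 49 for an unconstrained network of the same width, which illustrates the roughly $1/|G|$ parameter count mentioned in the abstract for $|G| = 2$.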
