Generalized Transferable Neural Networks for Steady-State Partial Differential Equations

Tao Cheng, Lili Ju, Zhonghua Qiao, Xiaoping Zhang

Abstract

Deep learning has emerged as a compelling framework for scientific and engineering computing, motivating growing interest in neural network-based solvers for partial differential equations (PDEs). Within this landscape, network architectures with deterministic feature construction have become an appealing approach, offering both high accuracy and computational efficiency in practice. Among them, the transferable neural network (TransNet) is a special class of shallow neural networks (i.e., single-hidden-layer architectures), whose hidden-layer parameters are predetermined according to the principle of uniformly distributed partition hyperplanes. Although TransNet has demonstrated strong performance in solving PDEs with relatively smooth solutions, its accuracy and stability may deteriorate in the presence of highly oscillatory solution structures, where activation saturation and system conditioning issues become limiting factors. In this paper, we propose a generalized transferable neural network (GTransNet) for solving steady-state PDEs, which augments the original TransNet design with additional hidden layers while preserving its interpretable feature-generation mechanism. In particular, the first hidden layer of GTransNet retains TransNet's parameter sampling strategy but incorporates an additional symmetry constraint on the neuron biases, while the subsequent hidden layers omit bias terms and employ a variance-controlled sampling strategy for selecting neuron weights.
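To make the construction described above concrete, the following NumPy sketch illustrates one plausible reading of the GTransNet feature pipeline. It assumes tanh activations with a shape parameter $\gamma$ (as in TransNet), interprets the symmetry constraint as pairing each first-layer bias with its negation, and uses a $\delta/\sqrt{\text{fan-in}}$ weight scale as the variance-controlled sampling rule; the paper's exact rules may differ. The function name `gtransnet_features` and all parameter defaults are illustrative.

```python
import numpy as np

def gtransnet_features(X, widths, gamma=6.0, delta=0.5, seed=0):
    """Hypothetical sketch of GTransNet's deterministic feature construction.

    X      : (n, d) array of collocation points
    widths : hidden-layer widths [M_1, ..., M_L] (M_1 assumed even)
    gamma  : shape parameter of the tanh activation
    delta  : variance-control parameter for layers 2, ..., L
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # First hidden layer: TransNet sampling with a symmetric-bias constraint.
    M1 = widths[0]
    A = rng.standard_normal((M1, d))
    A /= np.linalg.norm(A, axis=1, keepdims=True)   # directions uniform on S^{d-1}
    r = rng.uniform(0.0, 1.0, size=M1 // 2)
    r = np.concatenate([r, -r])                     # assumed reading of the symmetry constraint
    H = np.tanh(gamma * (X @ A.T + r))

    # Deeper hidden layers: bias-free, variance-controlled weight sampling.
    for M in widths[1:]:
        fan_in = H.shape[1]
        W = rng.standard_normal((fan_in, M)) * delta / np.sqrt(fan_in)
        H = np.tanh(H @ W)                          # no bias term, per the abstract
    return H

# Only the output layer remains trainable: fit it by linear least squares.
X = np.random.default_rng(1).uniform(-1.0, 1.0, size=(500, 2))
Phi = gtransnet_features(X, widths=[2000, 1000, 1000])
# c, *_ = np.linalg.lstsq(Phi, f_values, rcond=None)   # f_values: target samples
```

Because the hidden-layer parameters are fixed once sampled, solving a PDE reduces to a linear least-squares problem for the output weights, which is the source of the efficiency claimed above.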

Paper Structure

This paper contains 16 sections, 4 theorems, 47 equations, and 16 figures.

Key Result

Theorem 1

If the direction vectors $\{\mathbf{a}_m\}_{m=1}^M$ are uniformly distributed on the unit sphere $S^{d-1}$ and the offset parameters $\{r_m\}_{m=1}^M$ are independent and uniformly distributed over the unit interval $[0,1]$, then for any point $\mathbf{y}\in B_1(\mathbf{0})$ satisfying $\|\mathbf{y}\| \leq 1-\eta$ with $\eta\in(0,1)$, the probability that the partition hyperplane $\{\mathbf{x}: \mathbf{a}_m\cdot\mathbf{x} + r_m = 0\}$ passes within distance $\eta$ of $\mathbf{y}$ equals $\eta$, independent of the location of $\mathbf{y}$.
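As an informal sanity check on this uniformity claim, the short Monte Carlo sketch below (illustrative code, not from the paper) estimates the probability that a randomly sampled partition hyperplane $\{\mathbf{x}: \mathbf{a}\cdot\mathbf{x} + r = 0\}$ passes within distance $\eta$ of a point $\mathbf{y}$; under the sampling assumptions of Theorem 1, the estimate should be approximately $\eta$ for every $\mathbf{y}$ with $\|\mathbf{y}\| \leq 1-\eta$.

```python
import numpy as np

def hyperplane_hit_probability(y, eta, trials=200_000, seed=0):
    """Estimate P(dist(y, {x : a.x + r = 0}) <= eta) by Monte Carlo."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((trials, y.size))
    a /= np.linalg.norm(a, axis=1, keepdims=True)   # a uniform on S^{d-1}
    r = rng.uniform(0.0, 1.0, size=trials)          # r uniform on [0, 1]
    return np.mean(np.abs(a @ y + r) <= eta)        # |a.y + r| is the point-plane distance

# All three estimates should be close to eta = 0.1, regardless of where y sits.
for y in [np.zeros(2), np.array([0.5, 0.3]), np.array([0.0, 0.85])]:
    print(y, hyperplane_hit_probability(y, eta=0.1))
```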

Figures (16)

  • Figure 1: Visualization over the square $(-1,1)^2$ of a sample hidden-layer neuron $\psi_m(\mathbf{x})$ in TransNet associated with $B_{1.5}(\mathbf{0})$ in two dimensions, where the dashed line indicates the corresponding partition hyperplane. From left to right: $\gamma$ = 2, 6, 14.
  • Figure 2: Distribution histograms of the hidden-layer neuron activation values in TransNet associated with $B_{1.5}(\mathbf{0})$ in two dimensions. Activation values are collected from 500 input points over the unit square $(-1,1)^2$, with each point corresponding to 1000 different hidden-layer neurons, illustrating how the response pattern and concentration vary with the shape parameter $\gamma$ (a minimal reproduction sketch follows this list). From left to right: $\gamma = 2, 6, 14$.
  • Figure 3: Numerical results of fitting the high-frequency function $f(x) = \sin(30\pi x)$ in the interval $(-1,1)$ by using a TransNet associated with $B_{1.1}(\mathbf{0})$ in one dimension. 1000 uniformly distributed collocation points are used. From left to right: $\gamma = 2, 14$; from top to bottom: $M = 200, 1000$.
  • Figure 4: Network architecture of the proposed GTransNet with $L$ hidden layers.
  • Figure 5: Distribution histograms of the hidden-layer neuron activation values in the proposed GTransNet associated with $B_{1.5}(\mathbf{0})$ in two dimensions. Activation values are collected from 500 input points over the unit square $(-1,1)^2$, with each point corresponding to 2000 different neurons in the first hidden layer and 1000 neurons in the second and third hidden layers. From left to right: $\gamma = 2, 6, 14$. First row: the first hidden layer $\boldsymbol{\psi}_1$; second and third rows: the second and third hidden layers $\boldsymbol{\psi}_2$ and $\boldsymbol{\psi}_3$ of the GTransNet with $\delta=0.5$; fourth and fifth rows: the second and third hidden layers $\boldsymbol{\psi}_2$ and $\boldsymbol{\psi}_3$ of the GTransNet with $\delta=0.8$.
  • ...and 11 more figures
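Activation histograms of the kind shown in Figure 2 can be reproduced with a few lines of NumPy and Matplotlib. The sketch below follows the caption's setup (500 input points in $(-1,1)^2$, 1000 TransNet-style neurons per panel), but the exact rescaling used to associate the neurons with $B_{1.5}(\mathbf{0})$ is an assumption.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n_points, n_neurons, radius = 500, 1000, 1.5
X = rng.uniform(-1.0, 1.0, size=(n_points, 2))      # 500 points in the unit square

fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
for ax, gamma in zip(axes, [2, 6, 14]):
    A = rng.standard_normal((n_neurons, 2))
    A /= np.linalg.norm(A, axis=1, keepdims=True)   # directions uniform on S^1
    r = rng.uniform(0.0, 1.0, size=n_neurons)       # offsets uniform on [0, 1]
    acts = np.tanh(gamma * (X @ A.T / radius + r))  # assumed rescaling by the ball radius
    ax.hist(acts.ravel(), bins=60, density=True)
    ax.set_title(rf"$\gamma = {gamma}$")
plt.tight_layout()
plt.show()
```

For $\gamma = 14$ the histogram mass concentrates near $\pm 1$: this is the activation-saturation effect that the abstract identifies as a limiting factor for highly oscillatory solutions.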

Theorems & Definitions (10)

  • Remark 1
  • Theorem 1: Uniform Neuron Distribution in $B_1(\mathbf{0})$ for TransNet
  • Remark 2
  • Lemma 1: Zero Mean of the First Hidden Layer
  • Proof
  • Theorem 2: Zero Mean Propagation in GTransNet
  • Proof
  • Theorem 3: Controlled Variance Propagation in GTransNet
  • Proof
  • Remark 3