Deep Neural Networks with General Activations: Super-Convergence in Sobolev Norms

Yahong Yang, Juncai He

Abstract

This paper establishes a comprehensive approximation result for deep fully-connected neural networks with commonly used and general activation functions in Sobolev spaces $W^{n,\infty}$, with errors measured in the $W^{m,p}$-norm for $m < n$ and $1\le p \le \infty$. The derived rates surpass those of classical numerical approximation techniques, such as finite element and spectral methods, exhibiting a phenomenon we refer to as super-convergence. Our analysis shows that deep networks with general activations can approximate weak solutions of partial differential equations (PDEs) with superior accuracy compared to traditional numerical methods at the approximation level. Furthermore, this work closes a significant gap in the error-estimation theory for neural-network-based approaches to PDEs, offering a unified theoretical foundation for their use in scientific computing.
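
For reference, the norm in which the error is measured can be written out as below. This is the standard textbook definition, not a quotation from the paper: the multi-index $\alpha$, the weak derivative $D^{\alpha}$, and the domain $\Omega \subset \mathbb{R}^d$ are assumed notation and may differ cosmetically from the paper's Definition 5.

```latex
% Standard Sobolev norms (assumed notation; see the paper's Definition 5 for the authors' version)
\[
  \|f\|_{W^{m,p}(\Omega)}
  = \Bigg( \sum_{|\alpha| \le m} \|D^{\alpha} f\|_{L^{p}(\Omega)}^{p} \Bigg)^{1/p},
  \qquad 1 \le p < \infty,
\]
\[
  \|f\|_{W^{m,\infty}(\Omega)}
  = \max_{|\alpha| \le m} \|D^{\alpha} f\|_{L^{\infty}(\Omega)}.
\]
```

Under this definition, an error bound in $W^{m,p}$ with $m \ge 1$ controls the derivatives of the approximation error up to order $m$, not only the function values.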

Paper Structure

This paper contains 20 sections, 25 theorems, 361 equations, 7 figures, 1 table.

Key Result

Theorem 3

Suppose that one of the following holds: [...] Then neural networks with activation $\sigma$ achieve super-convergence rates in $W^{m,\infty}$. More precisely, for any $f \in W^{n,\infty}(\Omega)$ with $0 \le m < n$, and for any $N, L \in \mathbb{N}_+$, there exists a $\sigma$-neural network $\phi$ with depth $C_1 L \log L$ and width $C_2 N \log \ldots$ [...] where the constants $C_0, C_1, C_2 > 0$ are independent of [...]

Figures (7)

  • Figure 1: (a) The ReLU activation function. (b) Approximation of $u$ and $u'$ by a deep ReLU network for the Poisson problem \ref{eq:poisson}.
  • Figure 2: Network architecture used to approximate $x_1x_2\ldots x_d$.
  • Figure 3: An alternative network architecture used to approximate $x_1x_2\ldots x_d$.
  • Figure 4: Plot of $s(x)$ for $m=1$ and $m=5$.
  • Figure 5: Illustration of the domains $\Omega_{\bm{m}}$ and the associated partition of unity functions $s_{\bm{m}}$ in the case $J=3$ and $d=2$.
  • ...and 2 more figures

Theorems & Definitions (36)

  • Remark 1
  • Remark 2
  • Theorem 3
  • Remark 4
  • Definition 5: Sobolev Spaces
  • Lemma 6
  • Lemma 7
  • Lemma 8
  • Lemma 9
  • Lemma 10
  • ...and 26 more