Two-hidden-layer ReLU neural networks and finite elements

Pengzhan Jin

TL;DR

This work connects ReLU neural networks with finite element (FE) function spaces by introducing a weak representation on convex polytope meshes: two-hidden-layer networks weakly represent piecewise linear FE functions, with neuron counts given explicitly in terms of the mesh. It states a constructive weak representation theorem that covers constant and linear finite element functions, and uses the link between network and FE spaces to analyze approximation in the $L^p$ norm. It also discusses strict representation of tensor finite element functions by tensor neural networks, with explicit formulas and a concrete 2D example. Overall, the paper offers a computable bridge between neural network approximation and FE analysis, supporting $L^p$ estimates and pointing toward efficient high-dimensional representations via tensor neural networks.

Abstract

We point out that (continuous or discontinuous) piecewise linear functions on a convex polytope mesh can be represented by two-hidden-layer ReLU neural networks in a weak sense. In addition, the numbers of neurons in the two hidden layers required for this weak representation are given explicitly in terms of the numbers of polytopes and hyperplanes involved in the mesh. The results naturally hold for constant and linear finite element functions. Such weak representation establishes a bridge between two-hidden-layer ReLU neural networks and finite element functions, and leads to a perspective for analyzing the approximation capability of ReLU neural networks in the $L^p$ norm via finite element functions. Moreover, we discuss the strict representation of tensor finite element functions via the recent tensor neural networks.
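To give a feel for why two hidden layers are enough for discontinuous pieces, here is a minimal 1-d sketch. It is an illustrative construction, not necessarily the one used in the paper: the function name `approx_indicator`, the parameter `eps`, and the reading of "weak sense" as agreement outside a set of small measure are all assumptions, since Definition 1 is not reproduced on this page. The first hidden layer has one ReLU per bounding hyperplane of the cell $[a, b]$, and the second hidden layer has a single ReLU that switches the cell on.

```python
def relu(t: float) -> float:
    """Standard ReLU activation."""
    return max(t, 0.0)


def approx_indicator(x: float, a: float, b: float, eps: float) -> float:
    """Two-hidden-layer ReLU sketch of the indicator of [a, b] (hypothetical helper).

    Equals 1 on [a, b], equals 0 when x is at least eps away from [a, b],
    and ramps linearly in between. The mismatch with the true indicator is
    confined to a set of measure 2 * eps.
    """
    # First hidden layer: one neuron per bounding hyperplane (here, the two endpoints).
    left_violation = relu(a - x)   # positive only when x < a
    right_violation = relu(x - b)  # positive only when x > b
    # Second hidden layer: a single neuron that is 1 inside the cell.
    return relu(1.0 - (left_violation + right_violation) / eps)


if __name__ == "__main__":
    for x in (-0.5, -0.125, 0.5, 1.125, 1.5):
        print(x, approx_indicator(x, a=0.0, b=1.0, eps=0.25))
```

Multiplying the output by a constant gives a piecewise constant function on one cell. As `eps` shrinks, the region where the network and the true indicator differ shrinks with it, which is the kind of behavior an $L^p$ analysis can exploit.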

Paper Structure

This paper contains 9 sections, 8 theorems, 65 equations, and 3 figures.

Key Result

Theorem 1

${\rm FNN}(2H_\mathcal{T}^i+H_\mathcal{T}^b,N_\mathcal{T}+1)$ weakly represents $\mathcal{V}_\mathcal{T}$ on $\mathcal{T}$.
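The layer sizes in Theorem 1 can be read off from a mesh with simple arithmetic. The sketch below assumes, based on the abstract, that $H_\mathcal{T}^i$ and $H_\mathcal{T}^b$ count the interior and boundary hyperplanes of the mesh and that $N_\mathcal{T}$ counts its polytopes; these symbols are not defined on this page, and the function name `fnn_widths` and the example mesh are hypothetical.

```python
def fnn_widths(h_interior: int, h_boundary: int, n_polytopes: int) -> tuple:
    """Hidden-layer widths (first, second) as stated in Theorem 1.

    Assumed meaning of the arguments (not defined in this excerpt):
      h_interior  ~ H_T^i, number of interior hyperplanes
      h_boundary  ~ H_T^b, number of boundary hyperplanes
      n_polytopes ~ N_T,   number of polytopes in the mesh
    """
    if min(h_interior, h_boundary, n_polytopes) < 0:
        raise ValueError("counts must be non-negative")
    first_layer = 2 * h_interior + h_boundary
    second_layer = n_polytopes + 1
    return first_layer, second_layer


if __name__ == "__main__":
    # Hypothetical mesh: a square split by one diagonal into two triangles,
    # i.e. 1 interior line, 4 boundary lines, 2 polytopes.
    print(fnn_widths(h_interior=1, h_boundary=4, n_polytopes=2))
```

Under these assumptions, the first hidden layer grows with the number of hyperplanes while the second grows with the number of cells.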

Figures (3)

  • Figure 1: A 2-d convex polygon mesh for constant finite element functions and its corresponding FNN size for weak representation.
  • Figure 2: A 2-d simplex mesh for linear finite element functions and its corresponding FNN size for weak representation.
  • Figure 3: A 2-d tensor-type mesh for continuous piecewise bilinear functions and its corresponding TNN size for strict representation.

Theorems & Definitions (18)

  • Definition 1
  • Theorem 1
  • Corollary 1
  • Corollary 2
  • Corollary 3
  • Lemma 1
  • proof
  • Remark 1
  • Remark 2
  • Remark 3
  • ...and 8 more