Can Neural Networks learn Finite Elements?

Julia Novo, Eduardo Terrés

TL;DR

The paper asks whether a neural network can learn the linear finite element (FE) solution of a 1D convection–diffusion problem. The authors construct a three-layer ReLU network that exactly represents the FE basis functions and devise a cost function for which the FE solution yields zero cost. They test several training strategies, from fully free training to FE-informed initialization with selected weights held fixed, in both diffusion-dominated and convection-dominated regimes, including a SUPG-stabilized FE variant. The results show that, without strong domain priors, the optimization is highly over-determined and typically does not outperform standard FE. Improvements occur when the inner layers are fixed to mimic the FE representation, which yields piecewise-linear network outputs that coincide with FE on the mesh. In the convection-dominated case, difficulties in enforcing the right boundary condition and capturing boundary layers highlight limitations of PINN-like approaches for classical PDEs. The study is consistent with prior work suggesting limited gains for low-dimensional problems and potential benefits in high-dimensional or data-assisted contexts. Overall, the work emphasizes the need for physics-informed structure in neural approaches to PDEs and offers a cautionary perspective on when neural networks can or cannot supplant traditional FE methods.

Abstract

The aim of this note is to construct a neural network for which the linear finite element approximation of a simple one-dimensional boundary value problem is a minimum of the cost function, in order to find out whether the neural network is able to reproduce the finite element approximation. The deeper goal is to shed some light on the problems one encounters when trying to use neural networks to approximate partial differential equations.

Paper Structure

This paper contains 6 sections, 1 theorem, 13 equations, 9 figures, 2 tables.

Key Result

Proposition 1

Let us define $W^{[2]}$, $b^{[2]}$ and $W^{[3]}$ by: [...] where $W_i^{[2]}$, $b_i^{[2]}$ and $W_i^{[3]}$ are defined in (W2b2) and (W3). Then $F \equiv u_h$.
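
To make the statement $F \equiv u_h$ concrete, the following is a minimal sketch of how a ReLU network can reproduce a piecewise-linear FE function. It is not the construction of Proposition 1: the weights in (W2b2) and (W3) are not reproduced here, so the sketch relies on the standard one-hidden-layer identity that writes each hat function as a combination of three ReLU units. The uniform mesh, the function names `hat_relu` and `relu_network`, and the stand-in nodal values are all assumptions made for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def hat_relu(x, nodes, i):
    """Hat function phi_i on a uniform mesh, written with three ReLU units.

    Assumption: uniform spacing h, so the neighbours of nodes[i] are
    nodes[i] - h and nodes[i] + h (also used for the two boundary nodes).
    """
    h = nodes[1] - nodes[0]
    xi = nodes[i]
    return (relu(x - (xi - h)) - 2.0 * relu(x - xi) + relu(x - (xi + h))) / h

def relu_network(x, nodes, coeffs):
    """Weighted sum of ReLU hats: a piecewise-linear function of x."""
    return sum(c * hat_relu(x, nodes, i) for i, c in enumerate(coeffs))

N = 20
nodes = np.linspace(0.0, 1.0, N + 1)

# Stand-in nodal values (hypothetical); a real test would use the FE solution.
u_nodal = np.sin(np.pi * nodes)

x = np.linspace(0.0, 1.0, 1001)
F = relu_network(x, nodes, u_nodal)   # network output
u_h = np.interp(x, nodes, u_nodal)    # piecewise-linear interpolant on the mesh

max_diff = float(np.max(np.abs(F - u_h)))
print(max_diff)
```

With the breakpoints fixed at the mesh nodes, the network output and the piecewise-linear function agree up to rounding error on the whole interval, which mirrors the reported observation that fixing the inner layers yields outputs that coincide with FE on the mesh.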

Figures (9)

  • Figure 1: $N=40$, $N_{\rm iter}=10^4$, $\eta=10^{-4}$, $\beta=10^{-4}$.
  • Figure 2: $N=20$, $N_{\rm iter}=3\times10^5$, $\eta=10^{-6}$, $\beta=0$.
  • Figure 3: Left: $N=40$, $N_{\rm iter}=5\times10^5$, $\eta=10^{-7}$, $\beta=0$. Right: $N=100$, $N_{\rm iter}=3\times10^5$, $\eta=10^{-8}$, $\beta=0$.
  • Figure 4: $N=20$, $N_{\rm iter}=2\times10^5$, $\eta=10^{-6}$, $\beta=0$.
  • Figure 5: Absolute value of errors between finite element and neural network approximations for $N=20, 40$ and $100$.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Definition 1
  • Proposition 1
  • Proof 1