Error analysis for hybrid finite element/neural network discretizations

Uladzislau Kapustsin; Utku Kaya; Johannes Pfefferer; Thomas Richter

TL;DR

The paper addresses the efficient solution of Poisson-type PDEs by augmenting a coarse finite element solution with locally applied neural network corrections that capture fine-scale fluctuations, forming a hybrid FE–NN solver within a DNN-MG-inspired framework. It develops a rigorous a priori error analysis for both global and local update schemes and proves stability bounds for the neural network updates. Numerical experiments support the analysis and indicate that the error is governed by the richness and quality of the training data as well as by network stability. The results show that patch-wise, domain-agnostic corrections can achieve fine-scale accuracy at modest online cost, and that careful data preparation and architecture selection are crucial for performance. The work highlights the practical potential for fast, robust PDE solvers that generalize across domains, while identifying limitations in highly singular geometries and indicating directions for future extensions, including time-dependent problems.

Abstract

We describe and analyze a hybrid finite element/neural network method for predicting solutions of partial differential equations. The methodology is designed to obtain fine-scale fluctuations from neural networks in a local manner. Taking the source term and the coarse approximation as input, the network locally corrects a coarse finite element solution towards a fine solution. A key observation is the dependence of the prediction quality on the size of the training set, which consists of different source terms and the corresponding fine and coarse solutions. We provide an a priori error analysis of the method together with a stability analysis of the neural network. The numerical experiments confirm the capability of the network to predict fine finite element solutions. We also illustrate the generalization of the method to problems where the test and training domains differ.
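The local correction step described in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function name `hybrid_correction`, the uniform mesh on $[0,1]$, the `network` callable, and the averaging of corrections on nodes shared by neighbouring patches are all assumptions made for illustration.

```python
import numpy as np

def hybrid_correction(u_coarse, f_fine, refine, patch_size, network):
    """Add patch-wise corrections to a coarse 1D nodal solution (illustrative sketch).

    u_coarse   : nodal values on a uniform coarse mesh of [0, 1]
    f_fine     : source-term values at the fine-mesh nodes
    refine     : number of fine cells per coarse cell (assumed uniform refinement)
    patch_size : number of coarse cells per patch
    network    : hypothetical callable mapping the patch input vector to a
                 correction on the fine nodes of that patch
    """
    n_coarse_cells = len(u_coarse) - 1
    n_fine = n_coarse_cells * refine + 1
    assert len(f_fine) == n_fine
    assert n_coarse_cells % patch_size == 0

    # Prolongate the coarse solution to the fine mesh by linear interpolation.
    x_coarse = np.linspace(0.0, 1.0, n_coarse_cells + 1)
    x_fine = np.linspace(0.0, 1.0, n_fine)
    u_fine = np.interp(x_fine, x_coarse, u_coarse)

    w = np.zeros(n_fine)      # accumulated corrections
    count = np.zeros(n_fine)  # number of patches touching each fine node
    for p in range(n_coarse_cells // patch_size):
        c0, c1 = p * patch_size, (p + 1) * patch_size
        f0, f1 = c0 * refine, c1 * refine
        # Patch input: local coarse nodal values and local fine source values.
        inputs = np.concatenate([u_coarse[c0:c1 + 1], f_fine[f0:f1 + 1]])
        w[f0:f1 + 1] += network(inputs)
        count[f0:f1 + 1] += 1

    w /= count          # assumption: average on nodes shared by two patches
    w[0] = w[-1] = 0.0  # assumption: homogeneous Dirichlet boundary values
    return u_fine + w

# With a network that returns zero, the result is the interpolated coarse solution.
zero_net = lambda inputs: np.zeros(3)
print(hybrid_correction(np.array([0.0, 1.0, 0.0]), np.zeros(5), 2, 1, zero_net))
```

The same `network` is applied on every patch, which is what makes the correction local and independent of the global domain in this sketch.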

Paper Structure (20 sections, 4 theorems, 72 equations, 10 figures, 4 tables)

Introduction and motivation
Preliminaries
Finite element discretization
Notation
Hybrid finite element neural network discretization
Training of the neural network
Stability of the neural network
Error estimate using a global network update
Error estimates for local neural network updates
Numerical experiments
Configuration of the experimental setup
Neural network setup and optimization
Generation of training data
Optimization
Accuracy of the hybrid finite element neural network solver
...and 5 more sections

Key Result

Lemma 1

Let ${\cal N}$ be an $L({\cal N})$-Lipschitz continuous neural network, i.e., the neural network maps the input data to the output data Lipschitz continuously with Lipschitz constant $L({\cal N})$. Then it holds that where $I_h (f-f_i)\in V_h({\cal P})$ denotes the Lagrange interpolant of $f-f_i\in C(\bar{\Omega})$ on the patch ${\cal P}$.
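To make the Lipschitz constant $L({\cal N})$ concrete, the following sketch computes a standard upper bound for a multilayer perceptron: with 1-Lipschitz activations such as `tanh`, the product of the spectral norms of the weight matrices bounds the Lipschitz constant with respect to the Euclidean norm. This is a generic estimate and not necessarily the one used in the paper; the helper names `mlp_forward` and `lipschitz_upper_bound` and the layer sizes are hypothetical.

```python
import numpy as np

def mlp_forward(weights, biases, x):
    """Evaluate a plain MLP with tanh activations and a linear output layer."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(W @ x + b)
    return weights[-1] @ x + biases[-1]

def lipschitz_upper_bound(weights):
    """Product of spectral norms; valid because tanh is 1-Lipschitz."""
    return float(np.prod([np.linalg.norm(W, 2) for W in weights]))

# Illustrative random network with assumed layer sizes 4 -> 8 -> 8 -> 3.
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
weights = [rng.standard_normal((m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(m) for m in sizes[1:]]

bound = lipschitz_upper_bound(weights)
x, y = rng.standard_normal(4), rng.standard_normal(4)
ratio = (np.linalg.norm(mlp_forward(weights, biases, x) - mlp_forward(weights, biases, y))
         / np.linalg.norm(x - y))
print(ratio <= bound)  # the observed difference quotient never exceeds the bound
```

The bound is often pessimistic, so the difference quotient observed on sample pairs is typically much smaller than the product of norms.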

Figures (10)

Figure 1: Illustration of the hybrid solver. The finite element solution $u_H$ is approximated on the coarse mesh $\Omega_H$ (in black). A patch mesh $\Omega_{\cal P}$, which is at the same level as $\Omega_H$ or even coarser (in red; here, an even coarser one), combines elements from the coarse mesh. On each patch the solution is locally extracted. Together with the right-hand side information on the fine mesh $\Omega_h$ (shown in blue), it forms the input of a neural network. The output $w_{\cal N}$ is the local correction towards an improved solution and is gathered on the fine global mesh $\Omega_h$.
Figure 2: Multilayer perceptron
Figure 3: Training and application domains $\Omega^{tr}$ and $\Omega$ as well as the corresponding meshes $\Omega^{tr}_h$ and $\Omega_h$ can differ. Both, however, must be split into the same kind of patches. A patch ${\cal P}$ and the extended patch $\tilde{{\cal P}}$ are marked in both domains in orange and blue, respectively.
Figure 4: Top: example source terms from ${\cal F}$. Bottom: corresponding solutions
Figure 5: Dependency of the prediction quality $\|u-u_{\cal N}\|$ (for the test data) on the size of the training data set, number of layers and number of neurons
...and 5 more figures

Theorems & Definitions (15)

Definition 1: Patch
Definition 2: Hybrid solution
Definition 3: Multilayer perceptron
Lemma 1: Network stability
proof
Remark 1: Lipschitz constant of MLP
Theorem 1: A priori finite element error for the single-patch solution
proof
Theorem 2: A priori finite element error for the hybrid solution based on local patches
Definition 4: Local problems
...and 5 more
