Adaptive Finite Element Interpolated Neural Networks

Santiago Badia; Wei Li; Alberto F. Martín

TL;DR

This work tackles PDE approximation in the presence of sharp gradients and singularities by coupling neural networks with an $h$-adaptive finite element framework. The authors interpolate neural networks onto $H^1$-conforming FE spaces and train using a well-posed, preconditioned dual-norm residual loss, while automatically refining the mesh via a posteriori error indicators. They establish a priori error bounds that depend on the NN's expressiveness relative to the interpolation mesh and demonstrate the method on 2D and 3D forward/inverse problems, including singularities, with a preconditioning strategy that enhances training robustness and convergence. The approach yields accurate, scalable PDE surrogates capable of capturing localized features in complex geometries, with potential extensions to inverse problems and higher-order adaptivity.

Abstract

The use of neural networks to approximate partial differential equations (PDEs) has gained significant attention in recent years. However, the approximation of PDEs with localised phenomena, e.g., sharp gradients and singularities, remains a challenge, due to ill-defined cost functions in terms of pointwise residual sampling or poor numerical integration. In this work, we introduce $h$-adaptive finite element interpolated neural networks. The method relies on the interpolation of a neural network onto a finite element space that is gradually adapted to the solution during the training process to equidistribute an a posteriori error indicator. The use of adaptive interpolation is essential in preserving the non-linear approximation capabilities of the neural networks to effectively tackle problems with localised features. The training relies on a gradient-based optimisation of a loss function based on the (dual) norm of the finite element residual of the interpolated neural network. Automatic mesh adaptation (i.e., refinement and coarsening) is performed based on a posteriori error indicators until a certain level of accuracy is reached. The proposed methodology can be applied to indefinite and nonsymmetric problems. We carry out a detailed numerical analysis of the scheme and prove several a priori error estimates, depending on the expressiveness of the neural network compared to the interpolation mesh. Our numerical experiments confirm the effectiveness of the method in capturing sharp gradients and singularities for forward and inverse PDE problems, both in 2D and 3D scenarios. We also show that the proposed preconditioning strategy (i.e., using a dual norm of the residual as a cost function) enhances training robustness and accelerates convergence.
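
To make the difference between a plain and a dual-norm (preconditioned) residual loss concrete, the following is a minimal sketch, not the authors' implementation. It assumes a 1D Poisson problem with linear elements on a uniform mesh, takes the stiffness matrix as the Gram matrix of the $H^1_0$ inner product, and uses a perturbed FE solution as a stand-in for the nodal values of an interpolated neural network. All function names (`stiffness_1d`, `load_1d`, `residual_losses`) are hypothetical.

```python
import numpy as np


def stiffness_1d(n_cells):
    """Stiffness matrix of linear elements on a uniform mesh of (0, 1).

    Only interior nodes are kept (homogeneous Dirichlet conditions).
    """
    h = 1.0 / n_cells
    n = n_cells - 1
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h


def load_1d(f, n_cells):
    """Load vector with nodal (lumped) quadrature: b_i = h * f(x_i)."""
    h = 1.0 / n_cells
    x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]
    return h * f(x)


def residual_losses(A, b, B, u):
    """Return (plain, dual) losses for the discrete residual r = A u - b.

    plain: squared Euclidean norm of r.
    dual:  r^T B^{-1} r, the squared discrete dual norm of r with respect
           to the inner product whose Gram matrix is B (assumption: B is
           symmetric positive definite).
    """
    r = A @ u - b
    plain = float(r @ r)
    dual = float(r @ np.linalg.solve(B, r))
    return plain, dual


n_cells = 32
K = stiffness_1d(n_cells)
b = load_1d(lambda x: np.pi**2 * np.sin(np.pi * x), n_cells)
u_h = np.linalg.solve(K, b)  # FE solution: zero residual

# Stand-in for the nodal values of an interpolated network (assumption).
x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]
u_trial = u_h + 0.1 * x * (1.0 - x)

plain, dual = residual_losses(K, b, K, u_trial)
energy_err = float((u_trial - u_h) @ K @ (u_trial - u_h))
print(f"plain loss = {plain:.3e}, dual loss = {dual:.3e}, "
      f"energy error^2 = {energy_err:.3e}")
```

In this symmetric model problem the dual loss coincides with the squared energy-norm error of the trial function, which is one way to see why a dual-norm loss behaves like a preconditioned version of the plain residual loss.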

Paper Structure (27 sections, 8 theorems, 42 equations, 12 figures, 1 table, 1 algorithm)

Introduction
Methodology
Continuous problem
Finite element method
Neural networks
Finite element interpolated neural networks
$h$-Adaptive finite element interpolated neural networks
Gradient-conforming discretisation
Error indicators
Numerical analysis
Error analysis
Quasi-emulation
Generalisation error and equivalence classes
Implementation
Automatic mesh adaptation using forest-of-octrees
...and 12 more sections

Key Result

Theorem 3.1

Let us consider a pair of FE spaces $U_h$ and $V_h$ that satisfy the discrete inf-sup condition. The discrete quadratic minimisation problem is well-posed. The problem can be re-stated as: find $u_h \in U_h$ and $r_h = r_h(u_h) \in V_h$ such that […]. The solution satisfies the following a priori estimates: […]
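
For orientation, residual-minimisation problems of this type are commonly written as a saddle-point system of the following generic form. This is a sketch under assumptions, not the statement of Theorem 3.1: the bilinear form $a$, the linear form $\ell$, the inner product $(\cdot,\cdot)_V$, the inf-sup constant $\beta$ and the continuity constant $\rho$ are notation introduced here, and the constants in the paper's estimates may differ.

```latex
% Generic minimum-residual formulation (assumed notation):
% minimise the discrete dual norm of the residual over the trial space.
u_h = \operatorname*{arg\,min}_{w_h \in U_h}
      \, \sup_{v_h \in V_h \setminus \{0\}}
      \frac{\ell(v_h) - a(w_h, v_h)}{\| v_h \|_V}

% Equivalent mixed form: find u_h \in U_h and r_h \in V_h such that
\begin{aligned}
  (r_h, v_h)_V + a(u_h, v_h) &= \ell(v_h) && \forall\, v_h \in V_h, \\
  a(w_h, r_h) &= 0 && \forall\, w_h \in U_h.
\end{aligned}

% Typical quasi-optimality bound, with continuity constant \rho
% and discrete inf-sup constant \beta:
\| u - u_h \|_U \le \frac{\rho}{\beta} \inf_{w_h \in U_h} \| u - w_h \|_U
```

Here $r_h$ plays the role of the Riesz representative in $V_h$ of the discrete residual, so the second equation expresses that the residual is orthogonal to the image of the trial space.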

Figures (12)

Figure 1: The true solution, initial mesh, and $h$-adaptive FEM final mesh for the 2D arc wavefront problem. The final mesh is obtained through $h$-adaptive FEM with 7 iterative adaptation steps using the real FEM error as the error indicator.
Figure 2: Convergence of $u^{id}$ in $L^2$ and $H^1$ errors for the 2D arc wavefront problem, using different error indicators. The top row shows $L^2$ errors and the bottom row shows $H^1$ errors. The first column displays errors of the interpolated NNs and the second column displays errors of the NNs themselves. The line denotes the median, and the band represents the range from the minimum to the 90th percentile across 20 independent runs. The red dashed line illustrates the error of the $h$-adaptive FEM solution, using the real FEM error as the error indicator.
Figure 3: Comparison of final meshes obtained by training the $h$-adaptive FEINN for the 2D arc wavefront problem. The meshes result from 7 mesh adaptation steps using Kelly, network, and real error indicators, respectively.
Figure 4: Convergence history of $u^{id}$ in $L^2$ and $H^1$ errors for the 2D arc wavefront problem, using different norms in the preconditioned loss during training. The Kelly error indicator is used for mesh adaptation. The top row shows $L^2$ errors and the bottom row shows $H^1$ errors. The first column displays errors of the interpolated NNs and the second column displays errors of the NNs themselves.
Figure 5: Convergence of $u^{id}$ in $L^2$ and $H^1$ errors for the 2D arc wavefront problem, using different preconditioners during training. The Kelly error indicator is used for mesh adaptation. Refer to the caption of Fig. 2 for details on the information being displayed in this figure.
...and 7 more figures
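
The mesh adaptation steps mentioned in the captions above follow the usual estimate, mark, refine cycle. The sketch below illustrates that cycle in 1D only and is not the paper's forest-of-octrees implementation. It assumes a known target function with a sharp layer, uses the midpoint interpolation error as a "real error" style indicator, and applies Dörfler (bulk) marking with a hypothetical fraction `theta`. All names are illustrative.

```python
import numpy as np


def u_exact(x):
    """Target with a sharp layer at x = 0.5 (assumed test function)."""
    return np.arctan(50.0 * (x - 0.5))


def cell_indicators(nodes):
    """Per-cell indicator: linear interpolation error at the cell midpoint."""
    a, b = nodes[:-1], nodes[1:]
    mid = 0.5 * (a + b)
    interp_mid = 0.5 * (u_exact(a) + u_exact(b))
    return np.abs(u_exact(mid) - interp_mid)


def doerfler_mark(eta, theta=0.5):
    """Smallest set of cells (largest first) holding a fraction theta
    of the total squared indicator."""
    order = np.argsort(eta)[::-1]
    cumulative = np.cumsum(eta[order] ** 2)
    n_marked = int(np.searchsorted(cumulative, theta * cumulative[-1])) + 1
    return order[:n_marked]


def refine(nodes, marked):
    """Bisect every marked cell by inserting its midpoint."""
    mids = 0.5 * (nodes[marked] + nodes[marked + 1])
    return np.sort(np.concatenate([nodes, mids]))


nodes = np.linspace(0.0, 1.0, 9)  # initial uniform mesh with 8 cells
history = [cell_indicators(nodes).max()]
for _ in range(7):  # 7 adaptation steps, as in the captions above
    eta = cell_indicators(nodes)
    nodes = refine(nodes, doerfler_mark(eta))
    history.append(cell_indicators(nodes).max())

print("cells:", len(nodes) - 1, "smallest cell:", np.diff(nodes).min())
print("max indicator per step:", np.round(history, 4))
```

In the actual method the indicator would be computed from the interpolated network or from a Kelly-type jump estimator rather than from a known solution, and the network would be retrained on each adapted mesh.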

Theorems & Definitions (23)

Remark 2.1
Remark 2.2
Remark 2.3
Theorem 3.1
proof
Remark 3.2
Proposition 3.3
proof
Proposition 3.4
proof
...and 13 more
