Multilevel CNNs for Parametric PDEs based on Adaptive Finite Elements

Janina Enrica Schütte, Martin Eigel

TL;DR

The paper develops a multilevel CNN framework that emulates an adaptive finite element method (AFEM) for parametric PDEs, using adaptively refined FE data and a posteriori error estimators as inputs. By representing FE corrections, estimators, and refinement masks as images, it leverages submanifold sparse CNNs to perform level-wise multigrid-like updates with local corrections, achieving provable approximation guarantees of the AFEM steps. The authors derive quantitative complexity bounds showing CNNs can approximate the multigrid solver, estimator, and the full AFEM with logarithmic dependence on the inverse accuracy and linear dependence on the number of refinement levels and iterations. Numerical experiments on a cookie-type parametric diffusion problem demonstrate the approach conceptually, with CNNs closely tracking adaptive solver performance and estimator-driven marking. The work advances efficient, data-driven surrogates for high-dimensional parametric PDEs by integrating adaptive discretization, error estimation, and localized neural computations.

Abstract

A neural network architecture is presented that exploits the multilevel properties of high-dimensional parameter-dependent partial differential equations, enabling an efficient approximation of parameter-to-solution maps, rivaling best-in-class methods such as low-rank tensor regression in terms of accuracy and complexity. The neural network is trained with data on adaptively refined finite element meshes, thus reducing data complexity significantly. Error control is achieved by using a reliable finite element a posteriori error estimator, which is also provided as input to the neural network. The proposed U-Net architecture with CNN layers mimics a classical finite element multigrid algorithm. It can be shown that the CNN efficiently approximates all operations required by the solver, including the evaluation of the residual-based error estimator. In the CNN, a culling mask set-up according to the local corrections due to refinement on each mesh level reduces the overall complexity, allowing the network optimization with localized fine-scale finite element data. A complete convergence and complexity analysis is carried out for the adaptive multilevel scheme, which differs in several aspects from previous non-adaptive multilevel CNN. Moreover, numerical experiments with common benchmark problems from Uncertainty Quantification illustrate the practical performance of the architecture.
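
The role of the culling mask can be illustrated with a small NumPy sketch. It is an illustration of the general submanifold-sparse idea under stated assumptions, not the authors' implementation: the function name `masked_conv3x3`, the 3x3 kernel size, and the zero padding are all hypothetical choices. Outputs are computed only at pixels marked active, and inactive inputs are treated as zero, so the cost scales with the number of active pixels rather than the image size.

```python
import numpy as np

def masked_conv3x3(image, kernel, mask):
    """3x3 cross-correlation evaluated only at active pixels (hypothetical helper).

    image  : 2D array of nodal values on one level
    kernel : 3x3 array of weights
    mask   : 2D boolean array, True where the level carries a local correction
    """
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Inactive inputs are zeroed, then the image is zero-padded by one pixel.
    padded = np.pad(image * mask, 1)
    out = np.zeros_like(image, dtype=float)
    # Loop over active sites only; inactive outputs stay zero.
    for i, j in zip(*np.nonzero(mask)):
        out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * kernel)
    return out

# Usage: a 4x4 image in which only two neighbouring pixels are active.
image = np.ones((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1, 1] = mask[1, 2] = True
out = masked_conv3x3(image, np.ones((3, 3)), mask)
```

The active set of the output equals the active set of the input, which is the property that keeps sparse fine-level images sparse from layer to layer.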

Paper Structure

This paper contains 33 sections, 18 theorems, 98 equations, 15 figures, 4 algorithms.

Key Result

Theorem 1.1

Assume that $\kappa$ is uniformly bounded from below and above. Let $\varepsilon>0$ and $K, L\in\mathbb{N}$ be the number of iterations of the derived AFEM and the maximal refinements of each triangle, respectively. Consider a threshold marking strategy. Then there exists a CNN $\Psi$ such that the where $\mathcal{C}$ maps the finite element coefficients to the corresponding function.

Figures (15)

  • Figure 1.1: The first row depicts the parameter $\kappa$ to solution $u$ map for a realization of the parameter vector $\mathbf{y}\in\Gamma$ for the Darcy linear equation system. In the second row, the multigrid decomposition of the solution into a coarse grid function $v_1$ and finer grid corrections $v_2,v_3$ is visualized.
  • Figure 1.2: The support of the considered nodal basis functions on different levels is visualized. Uniformly refined meshes as used in cosi are shown in the top row, locally refined meshes as used in this work in the bottom row. The local refinement is realized by using a subset of the nodes in the uniformly refined meshes.
  • Figure 2.1: Two iterations of the adaptive finite element method on a unit square are depicted, where the first image on the left is a visualization of a possible parameter field $\kappa(\cdot,\mathbf{y})$. In the rest of the first row, the first mesh, solution, local error estimator and marker are depicted. The second row shows these steps for a locally refined mesh.
  • Figure 2.2: The two plots show the advantage of the $\mathrm{AFEM}$ in terms of degrees of freedom (FE coefficients) compared to solutions on uniformly refined meshes. Here, the mean and variance of the relative $H^1$ (left) and $L^2$ (right) errors of $100$ samples of the problem described in the numerics section are plotted for Dörfler marking with $\theta=0.1$.
  • Figure 3.1: Depicted is the decomposition of a continuous function $v\in V_h$ into coarse grid parts and fine grid corrections on uniformly refined grids. Each function on a uniformly refined grid can be represented by an image, where one pixel corresponds to the value of one node. For local corrections the images are sparse.
  • ...and 10 more figures
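
The decomposition into a coarse grid function and finer grid corrections described in Figures 1.1 and 3.1 can be sketched in one dimension. This is a simplified stand-in, assuming piecewise linear elements on nested uniform grids and a hierarchical interpolation splitting; the paper's multigrid decomposition may differ, and the names `prolong`, `decompose`, and `reconstruct` are hypothetical.

```python
import numpy as np

def prolong(coarse):
    """Linear interpolation from n nodes to the 2n-1 nodes of the refined grid."""
    fine = np.empty(2 * coarse.size - 1)
    fine[0::2] = coarse                            # shared nodes keep their values
    fine[1::2] = 0.5 * (coarse[:-1] + coarse[1:])  # new nodes take the midpoint value
    return fine

def decompose(fine_values, levels):
    """Split finest-grid nodal values into a coarse part plus one correction per level."""
    grids = [np.asarray(fine_values, dtype=float)]
    for _ in range(levels - 1):
        grids.append(grids[-1][::2])               # injection onto the next coarser grid
    grids.reverse()                                # coarsest grid first
    parts = [grids[0]]
    for coarse, fine in zip(grids[:-1], grids[1:]):
        parts.append(fine - prolong(coarse))       # what the coarser level cannot represent
    return parts

def reconstruct(parts):
    """Sum the coarse part and all corrections on the finest grid."""
    total = parts[0]
    for correction in parts[1:]:
        total = prolong(total) + correction
    return total

# Usage: a linear function plus one fine-scale bump at a single node.
x = np.linspace(0.0, 1.0, 9)
v = x.copy()
v[1] += 1.0
v1, v2, v3 = decompose(v, levels=3)
```

In this example the level-2 correction vanishes and the level-3 correction is nonzero at a single node, which mirrors the remark in Figure 3.1 that the images of local corrections are sparse.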

Theorems & Definitions (42)

  • Theorem 1.1: CNNs can approximate adaptive finite element solvers
  • Definition 2.1: Jump & error estimator
  • Definition 2.2: Dörfler marking
  • Definition 2.3: Threshold marking
  • Definition 3.1: Prolongation & weighted restriction
  • Theorem 3.1: Levelwise calculation of $Q_k A_\mathbf{y} \mathbf{u}$
  • proof
  • Theorem 3.2: Convergence of the LLMG
  • proof
  • Remark 3.1
  • ...and 32 more
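
The two marking strategies named in Definitions 2.2 and 2.3 can be sketched as follows. The definitions themselves are not reproduced in this summary, so the sketch assumes the common textbook forms: Dörfler marking selects a smallest set of elements whose squared indicators sum to at least $\theta$ times the total, and threshold marking selects every element whose indicator reaches a fixed value. The function names and the use of squared indicators are assumptions.

```python
import numpy as np

def doerfler_marking(eta, theta):
    """Indices of a smallest set M with sum_{T in M} eta_T^2 >= theta * sum_T eta_T^2."""
    eta2 = np.asarray(eta, dtype=float) ** 2
    order = np.argsort(-eta2, kind="stable")       # largest indicators first
    cumulative = np.cumsum(eta2[order])
    target = theta * eta2.sum()
    # First position at which the running sum reaches the target.
    count = int(np.searchsorted(cumulative, target, side="left")) + 1
    count = min(count, eta2.size)                  # guard against round-off at theta = 1
    return np.sort(order[:count])

def threshold_marking(eta, threshold):
    """Indices of all elements whose indicator is at least `threshold`."""
    eta = np.asarray(eta, dtype=float)
    return np.flatnonzero(eta >= threshold)

# Usage with four hypothetical local error indicators.
eta = [0.1, 0.5, 0.2, 0.4]
marked_doerfler = doerfler_marking(eta, theta=0.5)
marked_threshold = threshold_marking(eta, threshold=0.3)
```

Threshold marking decides element by element, without the sorting and cumulative sum that Dörfler marking needs, which is one plausible reason it is the strategy assumed in Theorem 1.1.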