Adaptive Multilevel Neural Networks for Parametric PDEs with Error Estimation

Janina E. Schütte, Martin Eigel

Abstract

A neural network architecture is presented for solving high-dimensional parameter-dependent partial differential equations (pPDEs). It is constructed to map parameters of the model data to the corresponding finite element solutions. To improve training efficiency and to enable control of the approximation error, the network mimics an adaptive finite element method (AFEM). It outputs a coarse grid solution and a series of corrections as produced in an AFEM, which allows the error decay to be tracked over successive layers of the network. The observed errors are measured by a reliable residual-based a posteriori error estimator, which makes it possible to reduce the network output to only a few parameters for the approximation. This leads to a problem-adapted representation of the solution on locally refined grids. Furthermore, each solution of the AFEM is discretized in a hierarchical basis. For the architecture, convolutional neural networks (CNNs) are chosen. The hierarchical basis then makes it possible to handle sparse images for finely discretized meshes. Additionally, since corrections on finer levels decrease in amplitude, and hence in importance for the overall approximation, the accuracy of the network approximation is allowed to decrease successively. This can be incorporated either in the number of generated high-fidelity samples used for training or in the size of the network components responsible for the fine grid outputs. The architecture is described and preliminary numerical examples are presented.
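The idea of a coarse grid solution plus corrections of decreasing amplitude can be illustrated in one dimension. The following is a minimal sketch under stated assumptions, not the authors' implementation: it uses uniform nested grids and piecewise linear interpolation in NumPy, the function name `multilevel_parts` is hypothetical, and $\sin(\pi x)$ stands in for a PDE solution.

```python
import numpy as np


def multilevel_parts(u, levels):
    """Split u into a coarse-grid interpolant and finer-grid corrections.

    Assumption (illustration only): level l uses 2**l + 1 uniform nodes on
    [0, 1], and every part is evaluated on the finest grid.
    """
    x_fine = np.linspace(0.0, 1.0, 2**levels + 1)
    parts = []
    previous = np.zeros_like(x_fine)
    for level in range(1, levels + 1):
        x_level = np.linspace(0.0, 1.0, 2**level + 1)
        # Piecewise linear interpolant of u on the level grid.
        interpolant = np.interp(x_fine, x_level, u(x_level))
        # First part is the coarse function; later parts are corrections.
        parts.append(interpolant - previous)
        previous = interpolant
    return x_fine, parts


# Stand-in for a PDE solution (hypothetical choice).
x_fine, parts = multilevel_parts(lambda x: np.sin(np.pi * x), levels=3)

# Largest absolute value of each part: coarse function first, then corrections.
amplitudes = [float(np.max(np.abs(v))) for v in parts]
print(amplitudes)
```

The parts sum to the interpolant on the finest grid, and for this smooth stand-in function the amplitudes shrink from level to level, which is the property the abstract uses to justify lower accuracy on finer levels.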

Paper Structure (11 sections, 1 theorem, 16 equations, 4 figures, 1 table)

This paper contains 11 sections, 1 theorem, 16 equations, 4 figures, 1 table.

Key Result

Theorem 2.1

Let $\mathbf{u}_\mathbf{y}$ be the FE coefficients of the Galerkin projection of the solution of the parametric problem onto the piecewise affine FE space over a uniform square mesh with triangle size $h$. Assume that the parameter fields are uniformly bounded over the parameters. Then for any $\varepsilon>0$

Figures (4)

  • Figure 1.1: The first row depicts the parameter $\kappa$ to solution $u$ map for a realization of $\mathbf{y}\in\Gamma$ for the parametric stationary diffusion PDE. In the second row, the applied multigrid decomposition of the solutions into a coarse grid function $v^1$ and finer grid corrections $v^2,v^3$ is visualized.
  • Figure 2.1: The CNN architecture for three levels is depicted.
  • Figure D.1: Plotted are the errors over the number of coefficients of the representation of the solution computed on uniform meshes and on locally refined meshes.
  • Figure D.2: The decay of the errors over the steps of the CNN is shown.
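
In the notation of Figure 1.1, the three-level multigrid decomposition can be written as a sum of the coarse grid function and the corrections. The partial sums $u^\ell$ are notation introduced here for illustration and may differ from the symbols used in the paper:

$$
u \approx v^1 + v^2 + v^3, \qquad u^\ell := \sum_{k=1}^{\ell} v^k, \quad \ell = 1, 2, 3.
$$

Tracking the error of $u^\ell$ against $u$ for increasing $\ell$ is the kind of per-step error decay that Figure D.2 reports.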

Theorems & Definitions (4)

  • Theorem 2.1
  • Remark 2.1
  • Definition 3.1: Cookie problem with $2$ inclusions
  • Definition B.1: Jump & error estimator