Multi-level Neural Networks for high-dimensional parametric obstacle problems

Martin Eigel, Cosmas Heiß, Janina E. Schütte

TL;DR

This work develops a multi-level CNN surrogate for high-dimensional parametric obstacle problems governed by an elliptic diffusion operator, framing the neural network through a multigrid perspective. It proves expressivity results showing the CNN can approximate a projected Richardson iteration with parameter counts that grow only polylogarithmically with accuracy, and that a multigrid-style V-cycle with monotone restriction can be emulated by the network. The approach decomposes the FE solution into coarse and fine-grid corrections and trains level-specific networks, achieving state-of-the-art accuracy on deterministic, stochastic, and rough-obstacle cases while maintaining stability as the parameter dimension grows. Practically, this yields efficient surrogates for repeated parametric solves in variational inequalities, with rigorous convergence insights and demonstrated empirical performance.

Abstract

A new method to solve computationally challenging (random) parametric obstacle problems is developed and analyzed, where the parameters can influence the related partial differential equation (PDE) and determine the position and surface structure of the obstacle. The governing equation is assumed to be a stationary elliptic diffusion problem. The high-dimensional solution of the obstacle problem is approximated by a specifically constructed convolutional neural network (CNN). This novel algorithm is inspired by a finite element constrained multigrid algorithm to represent the parameter-to-solution map. This has two benefits. First, it allows for efficient practical computations, since multi-level data is used as an explicit output of the NN thanks to appropriate data preprocessing. This improves the efficacy of the training process and subsequently leads to small errors in the natural energy norm. Second, the comparison of the CNN to a multigrid algorithm provides a means to carry out a complete a priori convergence and complexity analysis of the proposed NN architecture. Numerical experiments illustrate state-of-the-art performance for this challenging problem.

Paper Structure

This paper contains 12 sections, 6 theorems, 46 equations, 6 figures, 3 tables, 1 algorithm.

Key Result

Lemma 4.1

Let $\mathbf{y}\in\Gamma$ and $\kappa(\cdot,\mathbf{y})>0$ everywhere. Then for any nonzero $\mathbf{w}\in\mathbb{R}^{N}$ it holds that where $0<\omega_\mathbf{y}\leq \sigma_{\max}(A_\mathbf{y})^{-1}$.
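
The step-size bound $0<\omega_\mathbf{y}\leq \sigma_{\max}(A_\mathbf{y})^{-1}$ is the kind of condition under which a projected Richardson iteration (the iteration named in Theorem 5.2) is usually run. The following is a minimal sketch, assuming the standard update $\mathbf{u} \leftarrow \max(\mathbf{u} + \omega(\mathbf{f} - A\mathbf{u}), \boldsymbol{\varphi})$ taken componentwise; this summary does not spell out the paper's exact formulation. The 1D model problem, the load `f`, the obstacle value `phi`, and the function name `projected_richardson` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def projected_richardson(A, f, phi, omega, n_iter=5000):
    """Iterate u <- max(u + omega * (f - A u), phi), componentwise."""
    u = phi.copy()  # feasible start: u >= phi
    for _ in range(n_iter):
        u = np.maximum(u + omega * (f - A @ u), phi)
    return u

# Illustrative 1D stand-in: P1 stiffness matrix of -u'' on (0, 1), kappa = 1,
# zero Dirichlet boundary values (assumption, not the paper's 2D setting).
n = 15
h = 1.0 / (n + 1)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
f = -8.0 * h * np.ones(n)   # lumped load for a constant force -8 (hypothetical)
phi = -0.5 * np.ones(n)     # constant obstacle (hypothetical value)

# Largest admissible step size in the bound above: omega = 1 / sigma_max(A).
omega = 1.0 / np.linalg.norm(A, 2)

u = projected_richardson(A, f, phi, omega)
residual = A @ u - f              # nonnegative where the obstacle is active
contact = np.isclose(u, phi)      # discrete contact set
print("contact nodes:", np.flatnonzero(contact))
```

At convergence the discrete complementarity conditions hold: `u >= phi`, `residual >= 0`, and `residual * (u - phi)` vanishes, so the contact set comes out of the iteration rather than being prescribed.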

Figures (6)

  • Figure 3.1: An example realization of a field $\kappa$, the respective solution to the obstacle problem $u$ and the corresponding contact set indicating where the solution is equal to the obstacle are shown for a constant obstacle $\varphi \equiv -0.036$. The solution is equal to the obstacle in the purple part in the last image while it satisfies the PDE on the yellow part of the domain. Since the contact set is unknown in advance, it is part of the solution for the given parameter field.
  • Figure 4.1: The images in the first row show the weighted restriction as defined in JMLR:v24:23-0421. The second row visualizes the restriction operator of Definition 4.4. In both rows, the first image shows an obstacle in black and an initial guess for the solution in blue. The second image shows the restricted obstacle together with a coarse grid solution in green. The last image depicts the prolongated coarse grid solution together with the true obstacle. Taking a maximum when restricting the obstacle is critical for the prolongated coarse grid solution to remain above or equal to the true obstacle on the finer grid. The dependence on a level $\ell$ is suppressed in the notation.
  • Figure 6.1: The solution to the obstacle problem in $V_4$, shown on the left-hand side of the first row, can be decomposed into corrections in $V_1,V_2,V_3,V_4$ whose values decrease from level to level; these are shown on their respective grids in the rest of the first row. The sum of the corrections equals the full solution. The FE coefficients of the solution in $\mathbb{R}^{N_4}$ and of the corrections in $\mathbb{R}^{N_1},\mathbb{R}^{N_2},\mathbb{R}^{N_3},\mathbb{R}^{N_4}$ are visualized as images underneath each function.
  • Figure 6.2: The first image depicts a realization of the rough surface model Persson_2005. In the second and third images, the corresponding solution of the obstacle problem and the resulting contact set are shown, where the solution is equal to the obstacle. The contact set is colored in purple.
  • Figure 6.3: Error plots for the stochastic constant obstacle problem with parameter dimension $p=11$ are shown for a trained CNN. Errors of the CNN output compared to a reference solution are plotted in blue and errors of the finite element solution on the same grid as the CNN output to the reference solution are plotted in orange. A line indicates the mean of the relative errors over a test set and the area visualizes its variance. The left plot shows $H^1$ errors and the right plot shows $L^2$ errors.
  • ...and 1 more figure
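
The point made in the caption of Figure 4.1, that restricting the obstacle with a maximum keeps the prolongated coarse function feasible on the fine grid, can be checked in a small 1D experiment. This is a sketch under stated assumptions: nested 1D grids with linear interpolation as prolongation, a three-point maximum as a stand-in for the monotone restriction, and a full-weighting average as a generic stand-in for the weighted restriction (the exact operators in the paper and in JMLR:v24:23-0421 may differ). All function names and the sawtooth obstacle are hypothetical.

```python
import numpy as np

def prolongate(v_coarse):
    """Linear interpolation from m+1 coarse nodes to 2m+1 fine nodes."""
    v_fine = np.empty(2 * len(v_coarse) - 1)
    v_fine[0::2] = v_coarse
    v_fine[1::2] = 0.5 * (v_coarse[:-1] + v_coarse[1:])
    return v_fine

def _neighbours(phi_fine):
    """Left, centre, right fine values around each coarse node (edges repeated)."""
    padded = np.pad(phi_fine, 1, mode="edge")
    return padded[0:-2:2], padded[1:-1:2], padded[2::2]

def restrict_max(phi_fine):
    """Coarse obstacle as the maximum over each coarse node's fine neighbours."""
    return np.maximum.reduce(_neighbours(phi_fine))

def restrict_weighted(phi_fine):
    """Full-weighting average (illustrative stand-in for a weighted restriction)."""
    left, centre, right = _neighbours(phi_fine)
    return 0.25 * left + 0.5 * centre + 0.25 * right

# Hypothetical rough obstacle: a sawtooth around -0.1 on 33 fine nodes.
phi_fine = -0.1 + 0.05 * (-1.0) ** np.arange(33)

violations = {}
for name, restrict in [("max", restrict_max), ("weighted", restrict_weighted)]:
    u_coarse = restrict(phi_fine)  # tightest coarse function above the coarse obstacle
    # Positive value: the prolongated coarse function dips below the fine obstacle.
    violations[name] = np.max(phi_fine - prolongate(u_coarse))
    print(f"{name:>8}: largest obstacle violation = {violations[name]:.3f}")
```

With the maximum, every coarse value bounds all fine obstacle values it influences under interpolation, so no violation can occur; the averaged obstacle gives no such guarantee on rough data.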

Theorems & Definitions (16)

  • Lemma 4.1: generalization of schutte2024multilevelcnnsparametricpdes or braess
  • Lemma 4.2
  • proof
  • Definition 4.3: Prolongation matrices
  • Definition 4.4: Monotone restriction operator
  • Theorem 5.2: CNN for the projected Richardson iteration
  • Corollary 5.3: CNN for parametric obstacle problem
  • Remark 5.4: CNN for multigrid algorithm
  • proof : Proof of \ref{['theorem:omega>gamma']}
  • Lemma A.1: Maxima approximation
  • ...and 6 more
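
The prolongation matrices of Definition 4.3 are what tie the levels together in the decomposition shown in Figure 6.1, where the fine-grid solution is written as a sum of per-level corrections. The telescoping structure can be sketched as follows. This is an illustration only: it uses 1D grids, linear interpolation as prolongation, and nodal injection to obtain the coarse representations, whereas the paper's preprocessing may define the coarse-level data differently. The function names and the stand-in function `u_fine` are hypothetical.

```python
import numpy as np

def prolongate(v_coarse):
    """Linear interpolation from m+1 coarse nodes to 2m+1 fine nodes."""
    v_fine = np.empty(2 * len(v_coarse) - 1)
    v_fine[0::2] = v_coarse
    v_fine[1::2] = 0.5 * (v_coarse[:-1] + v_coarse[1:])
    return v_fine

def decompose(u_levels):
    """Corrections c_1 = u_1 and c_l = u_l - P u_{l-1} for l > 1."""
    corrections = [u_levels[0]]
    for coarse, fine in zip(u_levels[:-1], u_levels[1:]):
        corrections.append(fine - prolongate(coarse))
    return corrections

def reassemble(corrections):
    """Prolongate the running sum level by level and add each correction."""
    total = corrections[0]
    for c in corrections[1:]:
        total = prolongate(total) + c
    return total

# Stand-in for a level-4 obstacle solution on 2**4 + 1 nodes (not an FE solve).
n_levels = 4
x = np.linspace(0.0, 1.0, 2**n_levels + 1)
u_fine = np.maximum(-x * (1.0 - x), -0.2)

# Coarse representations by nodal injection (assumption): 3, 5, 9, 17 nodes.
u_levels = [u_fine[:: 2 ** (n_levels - level)] for level in range(1, n_levels + 1)]

corrections = decompose(u_levels)
for level, c in enumerate(corrections, start=1):
    print(f"level {level}: {c.size:2d} coefficients, max |correction| = {np.abs(c).max():.4f}")
```

Calling `reassemble(corrections)` recovers `u_fine` up to rounding, which mirrors the statement that the sum of the corrections equals the full solution; training one network per level then amounts to learning each entry of `corrections` separately.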