MgNO: Efficient Parameterization of Linear Operators via Multigrid

Juncai He, Xinliang Liu, Jinchao Xu

TL;DR

This work introduces MgNO, a neural operator that uses multigrid structures to parameterize the linear operators between neurons, and demonstrates the method's efficiency and accuracy with consistently state-of-the-art performance on different types of partial differential equations (PDEs).

Abstract

In this work, we propose a concise neural operator architecture for operator learning. Drawing an analogy with a conventional fully connected neural network, we define the neural operator as follows: the output of the $i$-th neuron in a nonlinear operator layer is defined by $O_i(u) = \sigma\left( \sum_j W_{ij} u + B_{ij}\right)$. Here, $W_{ij}$ denotes the bounded linear operator connecting the $j$-th input neuron to the $i$-th output neuron, and the bias $B_{ij}$ takes the form of a function rather than a scalar. Given its new universal approximation property, the efficient parameterization of the bounded linear operators between two neurons (Banach spaces) plays a critical role. As a result, we introduce MgNO, which uses multigrid structures to parameterize these linear operators between neurons. This approach offers both mathematical rigor and practical expressivity. Additionally, MgNO obviates the need for the lifting and projecting operators typically required in previous neural operators. Moreover, it seamlessly accommodates diverse boundary conditions. Our empirical observations reveal that MgNO is easier to train than other CNN-based models and less susceptible to overfitting than spectral-type neural operators. We demonstrate the efficiency and accuracy of our method with consistently state-of-the-art performance on different types of partial differential equations (PDEs).
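The neuron formula in the abstract can be illustrated on a discretized grid. The following is a minimal NumPy sketch, not the authors' implementation: it assumes each input neuron carries its own function $u_j$ sampled at `m` grid points, stands in a dense `m × m` matrix for each bounded linear operator $W_{ij}$ (MgNO instead parameterizes these operators with multigrid structures), and sums the bias functions $B_{ij}$ over $j$. The name `operator_layer` and all array shapes are hypothetical choices made for illustration.

```python
import numpy as np


def operator_layer(u, W, B, sigma=np.tanh):
    """Discretized O_i(u) = sigma(sum_j (W_ij u_j + B_ij)).

    u: (n_in, m)           input functions sampled at m grid points
    W: (n_out, n_in, m, m) dense stand-ins for the linear operators W_ij
    B: (n_out, n_in, m)    bias functions B_ij sampled on the same grid
    """
    # Apply W[i, j] to u[j] and sum over the input neurons j.
    pre = np.einsum("ijpq,jq->ip", W, u)
    # The bias is a function on the grid, not a scalar; sum it over j as well.
    pre = pre + B.sum(axis=1)
    return sigma(pre)


# Usage with random data (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
m, n_in, n_out = 16, 2, 3
u = rng.standard_normal((n_in, m))
W = rng.standard_normal((n_out, n_in, m, m)) / m
B = rng.standard_normal((n_out, n_in, m))
out = operator_layer(u, W, B)
print(out.shape)  # (3, 16): one output function per output neuron
```

A dense matrix costs $O(m^2)$ parameters per operator, which is the cost that a structured parameterization such as the multigrid one described above is meant to avoid.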

Paper Structure

This paper contains 38 sections, 1 theorem, 17 equations, 8 figures, 8 tables, 1 algorithm.

Key Result

Theorem 3.1

Let $\mathcal{X}= H^s(\Omega)$ and $\mathcal{Y} = H^{s'}(\Omega)$ for some $s,s' \ge 1$, and let $\sigma \in C(\mathbb R)$ be non-polynomial. Then for any continuous operator $\mathcal{O}^*: \mathcal{X}\to \mathcal{Y}$, compact set $\mathcal{C} \subset \mathcal{X}$, and $\epsilon > 0$, there is $n$ such that $$\inf_{\mathcal{O} \in \Xi_n} \sup_{u \in \mathcal{C}} \left\| \mathcal{O}^*(u) - \mathcal{O}(u) \right\|_{\mathcal{Y}} \le \epsilon,$$ where $\Xi_n$ denotes the shallow networks defined in equation \ref{eq:snndef} with $n$ neurons.

Figures (8)

  • Figure 1: Overview of $\mathcal{W}_{Mg}$ using a multi-channel V-cycle multigrid framework.
  • Figure 2: Qualitative comparisons on Darcy rough benchmark. Top: coefficient $a$, ground truth $u$, and predictions; bottom: the corresponding prediction error map for each model in the same color scale.
  • Figure 3: Comparison of training dynamics between MgNO, FNO and UNO. The x-axis represents the number of epochs, and the y-axis is the error on a log scale. We present both the $L^2$ and $H^1$ training and testing errors. For full comparisons, please refer to \ref{set:darcy:training dynamics}.
  • Figure 4: The error, plotted on a logarithmic scale, numerically demonstrates the approximation rate, which is approximately $1-\frac{1}{c}\approx0.1$, as outlined in equation \ref{eqn:mg_converge}.
  • Figure 5: (a) multiscale trigonometric coefficient, (b) reference solution.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Theorem 3.1
  • Remark 5.1