MGCNN: a learnable multigrid solver for sparse linear systems from PDEs on structured grids

Yan Xie; Minrui Lv; Chensong Zhang

TL;DR

The paper presents MGCNN, a learnable multigrid solver for sparse linear systems arising from PDEs discretized on structured grids, designed to generalize across right-hand sides (RHS), coefficients, and grid sizes. It enforces three principles: a multilevel hierarchy, linearity in the RHS, and weight sharing across levels. These are implemented in a two-phase network (setup and solve) built from RCNN/ResNet components, with no nonlinear activation in the solve phase. Training is unsupervised: it minimizes the residual loss $L = \frac{1}{N}\sum_i \|\mathrm{rhs}_i - A_i\,\mathrm{sol}_i\|^2$ with RHS drawn from white noise, so the solver can be trained offline and then applied to unseen problems. On convection-diffusion equations with heterogeneous diffusion coefficients, the method achieves speedups of roughly $3$ to $8\times$ over classical geometric multigrid (GMG) on grids up to $4095 \times 4095$. The approach reduces reliance on expert algorithm design and holds promise for extension to unstructured grids and nonlinear PDEs, though further work is needed to address space-invariance limitations and broader coefficient distributions.
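The residual loss above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single constant-coefficient 5-point Laplacian with zero Dirichlet boundaries in place of the per-sample operators $A_i$, and the names `apply_laplacian` and `residual_loss` are hypothetical.

```python
import numpy as np

def apply_laplacian(u, h):
    """Apply the 5-point stencil for -Laplace(u), zero Dirichlet boundaries (assumed operator)."""
    p = np.pad(u, 1)  # zero padding encodes the homogeneous boundary values
    return (4.0 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1]
            - p[1:-1, :-2] - p[1:-1, 2:]) / h**2

def residual_loss(rhs_batch, sol_batch, h):
    """Mean over the batch of the squared residual norms ||rhs_i - A sol_i||^2."""
    res = rhs_batch - np.stack([apply_laplacian(s, h) for s in sol_batch])
    return float(np.mean(np.sum(res**2, axis=(1, 2))))

rng = np.random.default_rng(0)
n, batch = 31, 4
h = 1.0 / (n + 1)

# White-noise right-hand sides; no reference solutions are needed.
rhs = rng.standard_normal((batch, n, n))

# Loss of the trivial guess sol = 0: just the mean squared norm of the RHS.
loss_zero = residual_loss(rhs, np.zeros_like(rhs), h)

# Loss of an exact solution: build rhs from a known u, so the residual vanishes.
u_exact = rng.standard_normal((batch, n, n))
rhs_exact = np.stack([apply_laplacian(u, h) for u in u_exact])
loss_exact = residual_loss(rhs_exact, u_exact, h)
```

Because the loss depends only on the operator, the RHS, and the solver output, no precomputed reference solutions are required, which is what makes the training unsupervised.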

Abstract

This paper presents a learnable solver tailored to iteratively solve sparse linear systems from discretized partial differential equations (PDEs). Unlike traditional approaches relying on specialized expertise, our solver streamlines the algorithm design process for a class of PDEs through training, which requires only training data of coefficient distributions. The proposed method is anchored by three core principles: (1) a multilevel hierarchy to promote rapid convergence, (2) linearity with respect to the right-hand side of the equations, and (3) weight sharing across different levels to facilitate adaptability to various problem sizes. Built on these foundational principles, and exploiting the similarity between the computation patterns of convolutional neural networks (CNNs) and multigrid components, we introduce a network adept at solving linear systems from PDEs with heterogeneous coefficients, discretized on structured grids. Notably, our proposed solver generalizes over right-hand-side terms, PDE coefficients, and grid sizes, so its training is purely offline. To evaluate its effectiveness, we train the solver on convection-diffusion equations featuring heterogeneous diffusion coefficients. The solver exhibits swift convergence to high accuracy over a range of grid sizes, extending from $31 \times 31$ to $4095 \times 4095$. Remarkably, our method outperforms the classical Geometric Multigrid (GMG) solver, demonstrating a speedup of approximately 3 to 8 times. Furthermore, our numerical investigation into the solver's capacity to generalize to untrained coefficient distributions reveals promising outcomes.
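Principle (2), linearity in the right-hand side, can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a few weighted Jacobi sweeps on an unscaled 5-point Laplacian as a stand-in for the solve phase (not the MGCNN network itself), and the helper names `apply_A` and `jacobi_solve` are hypothetical. Inserting a ReLU into the update breaks the linearity, which is consistent with the solve phase using no nonlinear activation.

```python
import numpy as np

def apply_A(u):
    """Unscaled 5-point Laplacian with zero Dirichlet boundaries (assumed operator)."""
    p = np.pad(u, 1)
    return (4.0 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1]
            - p[1:-1, :-2] - p[1:-1, 2:])

def jacobi_solve(rhs, sweeps=5, omega=0.8, activation=None):
    """A fixed number of weighted Jacobi sweeps starting from a zero initial guess."""
    u = np.zeros_like(rhs)
    for _ in range(sweeps):
        update = omega * (rhs - apply_A(u)) / 4.0  # diagonal of A is 4
        if activation is not None:
            update = activation(update)  # optional nonlinearity, for comparison only
        u = u + update
    return u

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
f, g = rng.standard_normal((2, 15, 15))

# Linear solver: solving for a combination equals combining the solutions.
linear_ok = np.allclose(jacobi_solve(2.0 * f - 3.0 * g),
                        2.0 * jacobi_solve(f) - 3.0 * jacobi_solve(g))

# With a ReLU in the update, the same identity no longer holds.
nonlinear_ok = np.allclose(
    jacobi_solve(2.0 * f - 3.0 * g, activation=relu),
    2.0 * jacobi_solve(f, activation=relu) - 3.0 * jacobi_solve(g, activation=relu))
```

A solver that is linear in the RHS can be used as a preconditioner or inside a residual-correction loop, since scaling the residual scales the correction by the same factor.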

Paper Structure (28 sections, 17 equations, 11 figures, 6 tables, 4 algorithms)

Introduction
Background
Related Works
Our Contributions
Paper Organization
Preliminaries and Notations
Solver as Decoder
Iterative Methods
Multigrid Algorithm
Neural Network
Discretization of PDEs on Structured Grids
Methodology
Design Principles and Network Phases
Setup Phase
Solve Phase
...and 13 more sections

Figures (11)

Figure 1: Solver as Decoder
Figure 2: Stacked ResNets
Figure 3: Stacked ResNets with two-grid hierarchy
Figure 4: The 3-level setup network. The inner network (in pink) is used to restrict the problem coefficient tensor (in magenta) from fine to coarse grid. The outer network (in blue) maps these coefficients on each level into setup tensors needed for the solve phase. The right-banded blue boxes indicate the use of the $\tanh$ activation function.
Figure 5: The 3-level solve network. The top part displays the setup output tensors (in blue). The bottom (in purple) represents the solve phase, where the right-hand-side and the solver output are updated within a user-defined iterative method.
...and 6 more figures

Theorems & Definitions (15)

Remark 2.1: State update strategy
Remark 2.2: Non-divergence form vs divergence form
Remark 3.1: No weight sharing when number of levels fixed
Remark 3.2: Matrix coefficients as input
Remark 3.3: Setup phase structure
Remark 3.4: Black-box solver
Remark 3.5: Symmetric problem
Remark 3.6: Non-heterogeneous problem
Remark 3.7: Other multigrid patterns
Remark 3.8: Vector-type PDEs
...and 5 more
