MGCNN: a learnable multigrid solver for sparse linear systems from PDEs on structured grids

Yan Xie, Minrui Lv, Chensong Zhang

TL;DR

The paper presents MGCNN, a learnable multigrid solver for sparse linear systems arising from PDEs discretized on structured grids, designed to generalize across right-hand sides (RHS), PDE coefficients, and grid sizes. It enforces three principles: a multilevel hierarchy, linearity in the RHS, and weight sharing across levels. These are implemented as a two-phase network (setup and solve) built from RCNN/ResNet components, with no nonlinear activation in the solve phase. Training is unsupervised: it minimizes the mean of the squared residual norms, $L = \frac{1}{N}\sum_{i=1}^{N} \|\mathrm{rhs}_i - A_i\,\mathrm{sol}_i\|^2$, with RHS drawn from white noise, so training is done offline and the trained solver transfers to unseen problems. On convection–diffusion equations with heterogeneous coefficients, the method achieves $3$–$8\times$ speedups over classical geometric multigrid (GMG) on grids up to $4095 \times 4095$. The approach reduces reliance on expert algorithm design and may extend to unstructured grids and nonlinear PDEs, though further work is needed to address space-invariance limitations and broader coefficient distributions.
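
The residual loss described above can be evaluated without any reference solutions, which is what makes the training unsupervised. The following is a minimal NumPy sketch of that computation, not the authors' implementation. It assumes a constant-coefficient 5-point Laplacian in place of the paper's heterogeneous operators $A_i$, and a few damped Jacobi sweeps in place of the trained network. The names `apply_A`, `jacobi_solver`, and `residual_loss` are hypothetical.

```python
import numpy as np

def apply_A(u):
    """Apply an unscaled 5-point Laplacian with zero Dirichlet boundary.
    Assumption: a stand-in for the paper's operators A_i."""
    p = np.pad(u, 1)  # zero padding models the boundary values
    return 4.0 * u - p[:-2, 1:-1] - p[2:, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:]

def jacobi_solver(rhs, sweeps=10, omega=0.8):
    """Hypothetical stand-in for the learned solver: damped Jacobi.
    Like the solve phase in the text, it is linear in rhs."""
    u = np.zeros_like(rhs)
    for _ in range(sweeps):
        u = u + omega * (rhs - apply_A(u)) / 4.0
    return u

def residual_loss(rhs_batch, solver):
    """L = (1/N) * sum_i ||rhs_i - A sol_i||^2; no exact solutions needed."""
    total = 0.0
    for rhs in rhs_batch:
        sol = solver(rhs)
        total += np.sum((rhs - apply_A(sol)) ** 2)
    return total / len(rhs_batch)

# White-noise right-hand sides on a 31 x 31 grid, as in the training setup.
rng = np.random.default_rng(0)
rhs_batch = rng.standard_normal((8, 31, 31))

loss_few = residual_loss(rhs_batch, lambda r: jacobi_solver(r, sweeps=2))
loss_many = residual_loss(rhs_batch, lambda r: jacobi_solver(r, sweeps=50))
print(f"loss after 2 sweeps: {loss_few:.4e}, after 50 sweeps: {loss_many:.4e}")
```

A better solver yields a smaller loss on the same batch, so the quantity can serve directly as a training objective once the stand-in solver is replaced by a parameterized one.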

Abstract

This paper presents a learnable solver tailored to iteratively solve sparse linear systems from discretized partial differential equations (PDEs). Unlike traditional approaches relying on specialized expertise, our solver streamlines the algorithm design process for a class of PDEs through training, which requires only training data of coefficient distributions. The proposed method is anchored by three core principles: (1) a multilevel hierarchy to promote rapid convergence, (2) linearity with respect to the right-hand side of the equations, and (3) weight sharing across different levels to facilitate adaptability to various problem sizes. Built on these foundational principles and considering the similar computation pattern of the convolutional neural network (CNN) as multigrid components, we introduce a network adept at solving linear systems from PDEs with heterogeneous coefficients, discretized on structured grids. Notably, our proposed solver possesses the ability to generalize over right-hand-side terms, PDE coefficients, and grid sizes, thereby ensuring its training is purely offline. To evaluate its effectiveness, we train the solver on convection-diffusion equations featuring heterogeneous diffusion coefficients. The solver exhibits swift convergence to high accuracy over a range of grid sizes, extending from $31 \times 31$ to $4095 \times 4095$. Remarkably, our method outperforms the classical Geometric Multigrid (GMG) solver, demonstrating a speedup of approximately 3 to 8 times. Furthermore, our numerical investigation into the solver's capacity to generalize to untrained coefficient distributions reveals promising outcomes.
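
To make principles (2) and (3) concrete, here is a hedged sketch of a classical geometric V-cycle written with stencil operations, the same computation pattern a CNN uses. It is not MGCNN: the stencils are fixed by hand rather than learned, and the operator is a constant-coefficient Poisson problem rather than heterogeneous convection-diffusion. All function names are hypothetical. Because the cycle contains no nonlinear activation it is linear in the right-hand side, and because every level reuses the same smoother parameters the same code runs on any grid of size $2^k - 1$.

```python
import numpy as np

def apply_A(u, h):
    """5-point Laplacian scaled by 1/h^2, zero Dirichlet boundary."""
    p = np.pad(u, 1)
    return (4.0 * u - p[:-2, 1:-1] - p[2:, 1:-1]
            - p[1:-1, :-2] - p[1:-1, 2:]) / h**2

def smooth(u, f, h, omega, sweeps):
    """Damped Jacobi; omega and sweeps are shared by all levels."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 4.0) * (f - apply_A(u, h))
    return u

def restrict(r):
    """Full weighting (a stride-2 stencil): (2m+1)^2 grid -> m^2 grid."""
    c = r[1::2, 1::2]
    edges = r[:-2:2, 1::2] + r[2::2, 1::2] + r[1::2, :-2:2] + r[1::2, 2::2]
    corners = (r[:-2:2, :-2:2] + r[:-2:2, 2::2]
               + r[2::2, :-2:2] + r[2::2, 2::2])
    return (4.0 * c + 2.0 * edges + corners) / 16.0

def prolong(ec):
    """Bilinear interpolation: m^2 grid -> (2m+1)^2 grid."""
    m = ec.shape[0]
    p = np.pad(ec, 1)
    ef = np.zeros((2 * m + 1, 2 * m + 1))
    ef[1::2, 1::2] = ec
    ef[0::2, 1::2] = 0.5 * (p[:-1, 1:-1] + p[1:, 1:-1])
    ef[1::2, 0::2] = 0.5 * (p[1:-1, :-1] + p[1:-1, 1:])
    ef[0::2, 0::2] = 0.25 * (p[:-1, :-1] + p[:-1, 1:] + p[1:, :-1] + p[1:, 1:])
    return ef

def v_cycle(f, h, omega=0.8, sweeps=2):
    """One V-cycle from a zero initial guess; built from linear maps only."""
    if f.shape[0] == 1:
        return f * h**2 / 4.0  # exact solve of the 1 x 1 system
    u = smooth(np.zeros_like(f), f, h, omega, sweeps)
    r = f - apply_A(u, h)
    u = u + prolong(v_cycle(restrict(r), 2.0 * h, omega, sweeps))
    return smooth(u, f, h, omega, sweeps)

def solve(f, h, iters=10):
    """Outer residual-correction iteration around the V-cycle."""
    u = np.zeros_like(f)
    for _ in range(iters):
        u = u + v_cycle(f - apply_A(u, h), h)
    return u

rng = np.random.default_rng(1)
f, g = rng.standard_normal((2, 31, 31))
h31 = 1.0 / 32.0

# Linearity in the right-hand side: B(2f + 3g) == 2 B(f) + 3 B(g).
is_linear = np.allclose(v_cycle(2 * f + 3 * g, h31),
                        2 * v_cycle(f, h31) + 3 * v_cycle(g, h31))

# The same code and parameters are reused on two grid sizes.
ratios = {}
for n in (31, 63):
    rhs = rng.standard_normal((n, n))
    hn = 1.0 / (n + 1)
    u = solve(rhs, hn)
    ratios[n] = np.linalg.norm(rhs - apply_A(u, hn)) / np.linalg.norm(rhs)
print(is_linear, ratios)
```

Linearity is what allows such a cycle to be wrapped in an outer iterative method as a fixed linear operator. In the learned setting described in the abstract, the hand-picked smoothing and transfer stencils would be replaced by trained ones.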

Paper Structure (28 sections, 17 equations, 11 figures, 6 tables, 4 algorithms)

Figures (11)

  • Figure 1: Solver as Decoder
  • Figure 2: Stacked ResNets
  • Figure 3: Stacked ResNets with two-grid hierarchy
  • Figure 4: The 3-level setup network. The inner network (in pink) is used to restrict the problem coefficient tensor (in magenta) from the fine grid to the coarse grid. The outer network (in blue) maps these coefficients on each level into the setup tensors needed for the solve phase. The right-banded blue boxes indicate the use of the $\tanh$ activation function.
  • Figure 5: The 3-level solve network. The top part displays the setup output tensors (in blue). The bottom (in purple) represents the solve phase, where the right-hand-side and the solver output are updated within a user-defined iterative method.
  • ...and 6 more figures

Theorems & Definitions (15)

  • Remark 2.1: State update strategy
  • Remark 2.2: Non-divergence form vs divergence form
  • Remark 3.1: No weight sharing when number of levels fixed
  • Remark 3.2: Matrix coefficients as input
  • Remark 3.3: Setup phase structure
  • Remark 3.4: Black-box solver
  • Remark 3.5: Symmetric problem
  • Remark 3.6: Non-heterogeneous problem
  • Remark 3.7: Other multigrid patterns
  • Remark 3.8: Vector-type PDEs
  • ...and 5 more