
Matrix-free GPU-accelerated saddle-point solvers for high-order problems in $H(\mathrm{div})$

Will Pazner, Tzanio Kolev, Panayot Vassilevski

Abstract

This work describes the development of matrix-free GPU-accelerated solvers for high-order finite element problems in $H(\mathrm{div})$. The solvers are applicable to grad-div and Darcy problems in saddle-point formulation, and have applications in radiation diffusion and porous media flow problems, among others. Using the interpolation-histopolation basis (cf. SIAM J. Sci. Comput., 45 (2023), A675-A702, arXiv:2203.02465), efficient matrix-free preconditioners can be constructed for the $(1,1)$-block and Schur complement of the block system. With these approximations, block-preconditioned MINRES converges in a number of iterations that is independent of the mesh size and polynomial degree. The approximate Schur complement takes the form of an M-matrix graph Laplacian, and therefore can be well-preconditioned by highly scalable algebraic multigrid methods. High-performance GPU-accelerated algorithms for all components of the solution algorithm are developed, discussed, and benchmarked. Numerical results are presented on a number of challenging test cases, including the "crooked pipe" grad-div problem, the SPE10 reservoir modeling benchmark problem, and a nonlinear radiation diffusion test case.

Paper Structure

This paper contains 17 sections, 5 theorems, 48 equations, 10 figures, 4 tables, 1 algorithm.

Key Result

Proposition 1

Let $\mathcal{B}$ be the block-diagonal preconditioner where $\mathsf{S} = \mathsf{C} + \mathsf{D} \mathsf{A}^{-1} \mathsf{D}^\mathsf{T}$ is the (negative) Schur complement of $\mathcal{A}$ with respect to the $(1,1)$-block. This preconditioner is optimal in the sense that
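The statement of the proposition is cut off above, so the following is only a minimal numerical sketch of a closely related, well-known special case, not a reproduction of the paper's result. It assumes the saddle-point matrix has the form $\mathcal{A} = \begin{pmatrix} \mathsf{A} & \mathsf{D}^\mathsf{T} \\ \mathsf{D} & -\mathsf{C} \end{pmatrix}$ (consistent with the sign of $\mathsf{S}$ given above), that $\mathcal{B} = \mathrm{diag}(\mathsf{A}, \mathsf{S})$, and that $\mathsf{C} = 0$. Under these assumptions, the classical result of Murphy, Golub, and Wathen says that $\mathcal{B}^{-1}\mathcal{A}$ has only the eigenvalues $1$ and $(1 \pm \sqrt{5})/2$, which is why MINRES with the exact block-diagonal preconditioner converges in at most three iterations. The matrices, sizes, and variable names below are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3  # hypothetical block sizes, chosen only for illustration

# Random SPD (1,1)-block A and full-rank off-diagonal block D (test data).
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)
D = rng.standard_normal((m, n))
C = np.zeros((m, m))  # assumption: zero (2,2)-block

# Saddle-point matrix and (negative) Schur complement S = C + D A^{-1} D^T.
K = np.block([[A, D.T], [D, -C]])
S = C + D @ np.linalg.solve(A, D.T)

# Block-diagonal preconditioner diag(A, S) (assumed form of the preconditioner).
B = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), S]])

# Eigenvalues of the preconditioned operator B^{-1} K (real, since B is SPD
# and K is symmetric).
eigs = np.linalg.eigvals(np.linalg.solve(B, K)).real

# Distance of each computed eigenvalue to the nearest of the three expected values.
expected = np.array([(1 - np.sqrt(5)) / 2, 1.0, (1 + np.sqrt(5)) / 2])
dist = np.abs(eigs[:, None] - expected[None, :]).min(axis=1)
print("largest deviation from {1, (1 +/- sqrt(5))/2}:", dist.max())
```

In practice the exact inverses of $\mathsf{A}$ and $\mathsf{S}$ are replaced by approximations (as the abstract describes), in which case the eigenvalues spread into clusters around these values rather than coinciding with them.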

Figures (10)

  • Figure 1: Left: high-order ($p=9$) hexahedral element with 10 Gauss--Lobatto nodes in each dimension. Right: subelement mesh with vertices at the Gauss--Lobatto points.
  • Figure 2: Condition number of the $L^2$ mass matrix preconditioned by diagonal scaling for different choices of basis. Left: 2D case on a skewed quadrilateral. Right: 3D case on a skewed hexahedron.
  • Figure 3: Computed condition numbers of the preconditioned Schur complement. The transformed system is given by $\widetilde{\mathsf{S}}^{-1}\mathsf{S}$, and the untransformed system is given by $\widetilde{\mathsf{S}}'^{-1} \mathsf{S}'$.
  • Figure 4: Solver diagram for block preconditioners applied to the transformed saddle-point system \ref{eq:transformed-saddle-point}.
  • Figure 5: GPU DG mass inverse setup phase.
  • ...and 5 more figures

Theorems & Definitions (12)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Remark 1
  • Remark 2
  • Remark 3
  • Proposition 4
  • ...and 2 more