High-performance matrix-free unfitted finite element operator evaluation

Maximilian Bergbauer, Peter Munch, Wolfgang A. Wall, Martin Kronbichler

TL;DR

The paper develops a matrix-free framework for high-order unfitted finite element operator evaluation on tensor-product hexahedral meshes with the geometry embedded via a level set. It addresses the small cut cell problem with volume ghost penalties and handles unstructured quadrature through dimension-reduction techniques. By classifying the terms of the weak form into those with structured (tensor-product) and those with unstructured quadrature, and applying sum-factorization, the authors report a speedup of more than one order of magnitude over a sparse matrix-vector product for a $p=3$ DG discretization, along with scalable large-scale simulations. Performance models and CPU benchmarks quantify how closely the measurements follow roofline-style estimates, supporting the practicality of high-order unfitted methods in 3D. The work also discusses load balancing and preconditioning as critical factors for further improvements in scalability.

Abstract

Unfitted finite element methods, like CutFEM, have traditionally been implemented in a matrix-based fashion, where a sparse matrix is assembled and later applied to vectors while solving the resulting linear system. With the goal of increasing performance and enabling algorithms with polynomial spaces of higher degrees, this contribution chooses a more abstract approach by matrix-free evaluation of the operator action on vectors instead. The proposed method loops over cells and locally evaluates the cell, face, and interface integrals, including the contributions from cut cells and the different means of stabilization. The main challenge is the efficient numerical evaluation of terms in the weak form with unstructured quadrature points arising from the unfitted discretization in cells cut by the interface. We present design choices and performance optimizations for tensor-product elements and demonstrate the performance by means of benchmarks and application examples. We demonstrate a speedup of more than one order of magnitude for the operator evaluation of a discontinuous Galerkin discretization with polynomial degree three compared to a sparse matrix-vector product and develop performance models to quantify the performance properties over a wide range of polynomial degrees.
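
To illustrate the distinction the abstract draws between applying an assembled matrix and evaluating the operator action cell by cell, here is a minimal Python sketch. It assumes a 1D Laplacian with linear elements on a uniform mesh, which is far simpler than the paper's setting: there are no cut cells, no stabilization terms, and the local contribution is a precomputed 2x2 matrix rather than a quadrature-based evaluation. The function names are hypothetical and not taken from the paper, and a dense array stands in for the sparse matrix to keep the example short.

```python
import numpy as np

def local_stiffness(h):
    # Element stiffness matrix of a linear 1D element of length h
    # (assumption: uniform mesh, no boundary conditions applied).
    return np.array([[1.0, -1.0], [-1.0, 1.0]]) / h

def assemble(n_cells, h):
    # Matrix-based route: add every cell contribution into a global matrix.
    # A dense array is used here only for brevity.
    A = np.zeros((n_cells + 1, n_cells + 1))
    K = local_stiffness(h)
    for c in range(n_cells):
        A[c:c + 2, c:c + 2] += K
    return A

def apply_matrix_free(u, n_cells, h):
    # Matrix-free route: loop over cells, gather the local DoF values,
    # apply the local operator, and scatter-add into the result vector.
    v = np.zeros_like(u)
    K = local_stiffness(h)
    for c in range(n_cells):
        v[c:c + 2] += K @ u[c:c + 2]
    return v

if __name__ == "__main__":
    n_cells, h = 8, 1.0 / 8
    u = np.random.default_rng(0).standard_normal(n_cells + 1)
    # Both routes compute the same operator action on u.
    print(np.allclose(assemble(n_cells, h) @ u, apply_matrix_free(u, n_cells, h)))
```

The two routes produce the same vector; the matrix-free route never stores the global matrix, which is the property the abstract's performance comparison rests on.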

Paper Structure (20 sections, 32 equations, 11 figures, 2 algorithms)

Figures (11)

  • Figure 1: Classification of weak formulation terms in quadrature on respective cell/face type: volume quadrature (red), face quadrature (orange) and surface quadrature (blue); structured (tensor-product) quadrature (dots) and unstructured quadrature (circles).
  • Figure 1: Structured cell and face tensor-product quadrature evaluation using sum-factorization for a 2D element with $p=2$. (a) Interpolation from DoF values into tensor-product quadrature points using the tensor-product structure of the shape functions. (b) Interpolation from cell DoF values into face DoF values, then in-face interpolation into quadrature points.
  • Figure 1: Convergence of 3D sphere benchmark for $p=1,2,3,4$.
  • Figure 1: Measurements (full line) vs. estimates (dashed line): Characteristics of memory transfer and arithmetic operations for the sparse matrix, the structured quadrature algorithm, and the unstructured quadrature algorithm.
  • Figure 1: Sphere benchmark cases
  • ...and 6 more figures
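
The structured cell evaluation described in the figure captions above (interpolation from DoF values into tensor-product quadrature points for a 2D element with $p=2$) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's implementation: it assumes equidistant Lagrange nodes on $[0,1]$ and a Gauss-Legendre rule with $p+1$ points per direction, and all names (`lagrange_basis_at`, `interpolate_naive`, `interpolate_sum_factorized`) are hypothetical.

```python
import numpy as np

def lagrange_basis_at(nodes, x):
    # S[q, i] = value of the i-th 1D Lagrange polynomial on `nodes` at x[q].
    S = np.ones((len(x), len(nodes)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if i != j:
                S[:, i] *= (x - xj) / (xi - xj)
    return S

p = 2
nodes = np.linspace(0.0, 1.0, p + 1)                  # assumed equidistant nodes
gauss_x, _ = np.polynomial.legendre.leggauss(p + 1)   # Gauss points on [-1, 1]
quad = 0.5 * (gauss_x + 1.0)                          # mapped to [0, 1]
S = lagrange_basis_at(nodes, quad)                    # 1D interpolation matrix

def interpolate_naive(u):
    # Apply the full 2D interpolation matrix S (x) S to the flattened DoFs.
    n_q = S.shape[0]
    return (np.kron(S, S) @ u.ravel()).reshape(n_q, n_q)

def interpolate_sum_factorized(u):
    # Sum-factorization: apply the 1D matrix once per direction instead of
    # forming the Kronecker product.
    return S @ u @ S.T

if __name__ == "__main__":
    u = np.random.default_rng(1).standard_normal((p + 1, p + 1))
    # Both evaluations give the same values at the quadrature points.
    print(np.allclose(interpolate_naive(u), interpolate_sum_factorized(u)))
```

In $d$ dimensions the same idea replaces one product with a matrix of size $(p+1)^d \times (p+1)^d$ by $d$ sweeps with the small 1D matrix, which is what makes tensor-product quadrature attractive at higher polynomial degrees.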

Theorems & Definitions (4)

  • Remark 3.1
  • Remark 3.2
  • Remark 3.3
  • Remark 5.1