Accelerating multigrid with streaming chiral SVD for Wilson fermions in lattice QCD

Travis Whyte, Andreas Stathopoulos, Eloy Romero

TL;DR

The paper tackles critical slowing-down in lattice QCD solvers for Wilson fermions by enlarging the multigrid test-vector basis and then truncating it with a chiral singular value decomposition (CSVD) to form efficient prolongation/restriction operators, $\bm{P}$ and $\bm{R}$. A streaming variant, iCSVD, mitigates storage costs by updating left singular vectors incrementally across streams while preserving near-null-space quality. Numerical experiments on an anisotropic lattice ($m_\pi \approx 239$ MeV) and an isotropic lattice ($m_\pi \approx 220$ MeV) show consistent speedups, and the dependence of the method on lattice volume is also examined. Near the critical mass, $m_q \approx m_{crit}$, iCSVD reduces the number of fine-level iterations by a factor of about 1.7 relative to regular multigrid, and the approach is positioned for integration with least-squares interpolation in future work.
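The truncation step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `chiral_svd_truncate`, the boolean `upper_mask` that marks one chirality, and the 12-components-per-site layout are all assumptions made for illustration. For one aggregation domain, the test vectors are split into their two chiral parts, each part is compressed to its `k` leading left singular vectors, and the results are zero-padded back to full length to give the columns of a local block of $\bm{P}$.

```python
import numpy as np

def chiral_svd_truncate(V, upper_mask, k):
    """Truncate test vectors V (n x m) to k basis vectors per chirality.

    upper_mask is a hypothetical boolean array of length n marking the rows
    that belong to one chirality; the remaining rows form the other one.
    Returns an n x 2k matrix with orthonormal columns.
    """
    n, _ = V.shape
    blocks = []
    for mask in (upper_mask, ~upper_mask):
        # Left singular vectors of the chiral part, ordered by singular value.
        U, _, _ = np.linalg.svd(V[mask, :], full_matrices=False)
        B = np.zeros((n, k), dtype=V.dtype)
        B[mask, :] = U[:, :k]  # zero-pad the other chirality
        blocks.append(B)
    return np.hstack(blocks)

# Toy data: 4 sites with 12 components each, m = 12 random "test vectors".
rng = np.random.default_rng(0)
n, m, k = 48, 12, 4
V = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
# Assumed layout: the first 6 components of every site carry one chirality.
upper_mask = (np.arange(n) % 12) < 6
P = chiral_svd_truncate(V, upper_mask, k)
print(P.shape)
```

Because the two chiral blocks have disjoint support, the columns of `P` are orthonormal by construction, so in this sketch the restriction block can be taken as the conjugate transpose of `P`.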

Abstract

A modification to the setup algorithm for the multigrid preconditioner of Wilson fermions in lattice QCD is presented. A larger basis of test vectors than that used in conventional multigrid is calculated by the smoother and truncated by singular value decomposition on the chiral components of the test vectors. The truncated basis is used to form the prolongation and restriction matrices of the multigrid hierarchy. This modification of the setup method is demonstrated to increase the convergence of linear solvers on an anisotropic lattice with $m_\pi \approx 239$ MeV from the Hadron Spectrum Collaboration and an isotropic lattice with $m_\pi \approx 220$ MeV from the MILC Collaboration. The lattice volume dependence of the method is also examined. Increasing the number of test vectors improves speedup up to a point, but storing these vectors becomes impossible in limited memory resources such as GPUs. To address storage cost, we implement a streaming singular value decomposition of the basis of test vectors on the chiral components and demonstrate a decrease in the number of fine-level iterations by a factor of 1.7 for $m_q \approx m_{crit}$.
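The storage argument behind the streaming decomposition can be made concrete with a short NumPy sketch. This is a generic incremental truncated SVD and only an assumption about how such an update might look; the paper's own algorithm may differ in its details. The function name `streaming_left_svd` and the batch size are hypothetical. Only `k` left singular vectors and one batch of new test vectors are held at a time, so peak storage is `n x (k + b)` instead of `n x m`. In the chiral setting, the same update would be applied to each chiral part of the test vectors separately.

```python
import numpy as np

def streaming_left_svd(batches, k):
    """Track the k leading left singular vectors of a column stream.

    batches yields n x b arrays of new test vectors. After each batch,
    the compressed factor U * diag(s) is extended by the new columns
    and re-truncated to rank k.
    """
    U, s = None, None
    for B in batches:
        # U * s scales column j of U by s[j], i.e. forms U @ diag(s).
        M = B if U is None else np.hstack([U * s, B])
        U, s, _ = np.linalg.svd(M, full_matrices=False)
        U, s = U[:, :k], s[:k]
    return U, s

# Toy check on a rank-3 matrix streamed in batches of 4 columns.
rng = np.random.default_rng(1)
n, m, r, k, b = 40, 12, 3, 4, 4
A = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))
stream = (A[:, j:j + b] for j in range(0, m, b))
U_stream, s_stream = streaming_left_svd(stream, k)
s_batch = np.linalg.svd(A, compute_uv=False)[:k]
print(np.allclose(s_stream, s_batch))
```

When the rank of the full basis does not exceed `k`, no information is discarded and the streamed singular values agree with the one-shot SVD. For a basis of higher rank, each truncation drops the trailing singular directions, so the streamed result is an approximation.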

Paper Structure

This paper contains 13 sections, 9 equations, 7 figures, 3 tables, 4 algorithms.

Figures (7)

  • Figure 1: The mean solve time as a function of the initial basis size for Configuration A (left) and Configuration B (right). The bottom and top $y$-axis displays $m$ for levels $\ell = 0,1$, respectively.
  • Figure 2: The singular spectrum for the chirally split test vectors on the first domain for level $\ell = 0$ (top) and $\ell = 1$ (bottom) of Configuration A (left) and Configuration B (right).
  • Figure 3: The mean execution time for the system of linear equations when the degree of the truncation is varied using an initial basis size of $m = 96$ for Configuration A (left) and Configuration B (right). The bottom and top $y$-axis displays $k$ for levels $\ell = 0,1$, respectively.
  • Figure 4: The total number of iterations on $\ell = 0$ (upper left), $\ell = 1$ (upper right), $\ell = 2$ (lower left) and the mean solve time of the system of linear equations for Configuration A.
  • Figure 5: As Figure 4, for Configuration B.
  • ...and 2 more figures