Multigrid with Linear Storage Complexity
Daniel Bauer, Nils Kohl, Stephen F. McCormick, Rasmus Tamstorf
TL;DR
This work tackles the memory bottleneck in discretization-error-accurate PDE solvers by developing a full multigrid method that stores the solution and intermediate vectors in a compact, regressive-precision format, achieving $\mathcal O(n)$ bits of storage instead of the conventional $\mathcal O(n \log n)$, where $n$ is the number of degrees of freedom (DoFs). The method combines a compact representation across multigrid levels with a compact full approximation scheme and matrix-free operations to preserve linear arithmetic cost. Key contributions include a rigorous precision framework, memory-optimized algorithms for residuals and corrections, and a detailed cost analysis showing linear storage and near-linear computation. Numerical experiments on Poisson and biharmonic model problems show that the finest-grid solution can be stored with between 4 and 12 bits per DoF, independent of problem size, while maintaining discretization-error accuracy, implying substantial memory savings on memory-constrained hardware. The approach promises scalable PDE solving on next-generation HPC systems, with potential extensions to parallel implementations and broader PDE classes.
Abstract
As the discretization error for the solution of a partial differential equation (PDE) decreases, the precision required to store the corresponding coefficients naturally increases. Storing the solution's finite element coefficients explicitly requires $\mathcal O(n \log n)$ bits of storage, where $n$ is the number of degrees of freedom (DoFs). This paper presents a full multigrid method to compute the solution in a compressed format that reduces the storage complexity of the solution and intermediate vectors to $\mathcal O(n)$ bits. This reduction allows a matrix-free implementation to solve elliptic PDEs with an overall linear space complexity. For problems limited by the memory capacity of current supercomputers, we expect a memory footprint reduction of about an order of magnitude compared to state-of-the-art mixed-precision methods. We demonstrate the applicability of our algorithm by solving two model problems. Depending on the PDE and polynomial degree, but irrespective of the problem size, the solution vector on the finest grid requires between 4 and 12 bits per DoF, and the residual and correction require 3 to 6 bits each. Additional data is stored on the coarse grids with modestly increasing bit widths toward coarser grids.
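The storage argument above can be illustrated with a short back-of-the-envelope sketch. It is not the paper's algorithm or cost model: the function names, the error model (error proportional to $h^p$ with $n \propto h^{-d}$), the 8-bit fine-grid width, and the one-extra-bit-per-level growth are all assumptions chosen for illustration. The point is only that a bit width growing linearly toward coarser grids still sums to $\mathcal O(n)$ bits, because the number of DoFs shrinks geometrically.

```python
import math


def explicit_bits(n, order=2, dim=2):
    """Bits needed to store n coefficients explicitly at discretization accuracy.

    Assumed model (not from the paper): error ~ h**order and n ~ h**(-dim),
    so each coefficient needs about (order / dim) * log2(n) bits.
    """
    return n * math.ceil(order / dim * math.log2(n))


def compressed_bits(n, dim=2, fine_bits=8, growth=1):
    """Total bits over a grid hierarchy with level-dependent bit widths.

    Level 0 is the finest grid. Each coarser level has 2**dim times fewer
    DoFs and `growth` more bits per DoF (hypothetical linear growth).
    """
    total, level, dofs = 0, 0, n
    while dofs >= 1:
        total += dofs * (fine_bits + growth * level)
        dofs //= 2 ** dim
        level += 1
    return total


if __name__ == "__main__":
    for k in (5, 10, 15):
        n = 4 ** k
        print(n, explicit_bits(n) / n, compressed_bits(n) / n)
```

Under these assumed parameters, the explicit format needs 20 bits per DoF at $n = 4^{10}$ and 30 bits per DoF at $n = 4^{15}$, while the hierarchy total stays below 11.2 bits per fine-grid DoF for any $n$, since $\sum_{\ell \ge 0} (8 + \ell)/4^{\ell} = 32/3 + 4/9$.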
