A Randomized Algorithm for Preconditioner Selection
Conner DiPaolo, Weiqing Gu
TL;DR
The paper tackles the challenging problem of selecting effective preconditioners for iterative solvers by focusing on the preconditioner stability $\|\boldsymbol{I}-\boldsymbol{M}^{-1}\boldsymbol{A}\|_\mathsf{F}$. It introduces a randomized, sketching-based estimator that approximates this stability efficiently, and it proves that no deterministic algorithm can approximate the quantity in a useful amount of time, establishing randomness as essential. A practical algorithm is then shown to provably identify a preconditioner of near-minimal stability among $n$ candidates at a cost on the order of $n\log n$ conjugate gradient (CG) steps; it parallelizes trivially and has extensions for the case when a clear winner exists. The authors validate the approach on sparse systems and kernel-regression problems, demonstrating that the method often matches or outperforms the best candidate with manageable overhead and, in kernel regression, yields robust preconditioning where none previously existed. Overall, the work provides a theoretically grounded, scalable framework for preconditioner selection with meaningful impact for large-scale linear systems and data-driven applications.
Abstract
The task of choosing a preconditioner $\boldsymbol{M}$ to use when solving a linear system $\boldsymbol{Ax}=\boldsymbol{b}$ with iterative methods is difficult. For instance, even if one has access to a collection $\boldsymbol{M}_1,\boldsymbol{M}_2,\ldots,\boldsymbol{M}_n$ of candidate preconditioners, it is currently unclear how to practically choose the $\boldsymbol{M}_i$ which minimizes the number of iterations of an iterative algorithm to achieve a suitable approximation to $\boldsymbol{x}$. This paper makes progress on this sub-problem by showing that the preconditioner stability $\|\boldsymbol{I}-\boldsymbol{M}^{-1}\boldsymbol{A}\|_\mathsf{F}$, known to forecast preconditioner quality, can be computed in the time it takes to run a constant number of iterations of conjugate gradients through use of sketching methods. This is in spite of folklore that suggests the quantity is impractical to compute, and of a proof we give that the quantity cannot be approximated in a useful amount of time by a deterministic algorithm. Using our estimator, we provide a method which can provably select the minimal-stability preconditioner among $n$ candidates using floating point operations commensurate with running on the order of $n\log n$ steps of the conjugate gradients algorithm. Our method can also advise the practitioner to use no preconditioner at all if none of the candidates appears useful. The algorithm is extremely easy to implement and trivially parallelizable. In one of our experiments, we use our preconditioner selection algorithm to create, to the best of our knowledge, the first preconditioned method for kernel regression reported to never use more iterations than the non-preconditioned analog in standard tests.
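To make the idea concrete, the following is a minimal sketch of how a sketching-based stability estimate could look. It is not the authors' exact estimator or selection procedure. It assumes a plain Hutchinson-type Monte Carlo scheme with Gaussian probes, relying on the identity $\mathbb{E}\,\|\boldsymbol{B}\boldsymbol{z}\|_2^2=\|\boldsymbol{B}\|_\mathsf{F}^2$ for $\boldsymbol{B}=\boldsymbol{I}-\boldsymbol{M}^{-1}\boldsymbol{A}$ and $\boldsymbol{z}\sim\mathcal{N}(\boldsymbol{0},\boldsymbol{I})$, and it spends the same fixed number of probes on every candidate. The names `estimate_stability_sq`, `select_preconditioner`, `apply_A`, `apply_Minv`, and the `"none"` entry are hypothetical and chosen for illustration only.

```python
import numpy as np


def estimate_stability_sq(apply_A, apply_Minv, dim, num_probes=50, rng=None):
    """Monte Carlo estimate of ||I - M^{-1} A||_F^2 using only matrix-vector products.

    Assumption: plain Gaussian probes; the paper's estimator may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    total = 0.0
    for _ in range(num_probes):
        z = rng.standard_normal(dim)       # probe with E[z z^T] = I
        r = z - apply_Minv(apply_A(z))     # r = (I - M^{-1} A) z
        total += float(r @ r)              # E[||r||^2] = ||I - M^{-1} A||_F^2
    return total / num_probes


def select_preconditioner(apply_A, candidates, dim, num_probes=50, seed=0):
    """Return (best_name, estimates) for the smallest estimated stability.

    `candidates` maps a name to a function that applies M^{-1} to a vector.
    A hypothetical identity entry "none" is added so that "no preconditioner"
    can be the recommendation when no candidate looks useful.
    """
    rng = np.random.default_rng(seed)
    pool = {"none": lambda v: v, **candidates}
    estimates = {
        name: estimate_stability_sq(apply_A, apply_Minv, dim, num_probes, rng)
        for name, apply_Minv in pool.items()
    }
    best = min(estimates, key=estimates.get)
    return best, estimates


# Illustrative usage on a small symmetric positive definite matrix.
demo_rng = np.random.default_rng(1)
G = demo_rng.standard_normal((40, 40))
A = G @ G.T + 40 * np.eye(40)
d = np.diag(A)
best, estimates = select_preconditioner(
    lambda v: A @ v,
    {"jacobi": lambda v: v / d},           # diagonal (Jacobi) preconditioner
    dim=40,
)
```

Each probe costs one product with $\boldsymbol{A}$ and one application of $\boldsymbol{M}^{-1}$, which is roughly the work of one preconditioned conjugate gradients step. The candidates are evaluated independently of one another, so the loop over them could be distributed across workers.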
