Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning
Michał Dereziński, Christopher Musco, Jiaming Yang
TL;DR
This work introduces Multi-level Sketched Preconditioning (MSP), a randomized, sketch-based framework for solving linear systems $Ax = b$. MSP builds a low-rank Nyström preconditioner from sparse random sketches and inverts that preconditioner using additional levels of sketching and preconditioning. The key idea is that convergence depends on an average condition number of the tail of the spectrum, which yields faster runtimes when the matrix is well-conditioned apart from $k$ outlying large singular values: MSP solves such systems in $\tilde{O}(n^{2.065} + k^{\omega})$ time. For regularized systems $(A + \lambda I)x = b$ with positive semidefinite $A$ and effective dimension $d_\lambda$, MSP runs in $\tilde{O}(n^2 + d_\lambda^{\omega})$ time, and it gives an $\tilde{O}(n^{2.11})$-time algorithm for Schatten 1-norm estimation, improving on the previous $\tilde{O}(n^{2.18})$ bound. The framework includes a stability analysis of inexact preconditioned Lanczos iterations and a three-level extension to general linear systems, supported by cost analyses and spectral guarantees for the inner solves. Previous state-of-the-art algorithms for most of these problems relied on stochastic iterative methods such as stochastic coordinate and gradient descent; MSP instead builds on matrix sketching and exploits a small average condition number. All results are proven in the real RAM model, with applications that include kernel ridge regression and matrix-norm estimation in large-scale machine learning and numerical linear algebra.
Abstract
We present a new class of preconditioned iterative methods for solving linear systems of the form $Ax = b$. Our methods are based on constructing a low-rank Nyström approximation to $A$ using sparse random matrix sketching. This approximation is used to construct a preconditioner, which itself is inverted quickly using additional levels of random sketching and preconditioning. We prove that the convergence of our methods depends on a natural average condition number of $A$, which improves as the rank of the Nyström approximation increases. Concretely, this allows us to obtain faster runtimes for a number of fundamental linear algebraic problems:

1. We show how to solve any $n \times n$ linear system that is well-conditioned except for $k$ outlying large singular values in $\tilde{O}(n^{2.065} + k^{\omega})$ time, improving on a recent result of [Dereziński, Yang, STOC 2024] for all $k \gtrsim n^{0.78}$.
2. We give the first $\tilde{O}(n^2 + d_\lambda^{\omega})$ time algorithm for solving a regularized linear system $(A + \lambda I)x = b$, where $A$ is positive semidefinite with effective dimension $d_\lambda = \mathrm{tr}(A(A + \lambda I)^{-1})$. This problem arises in applications like Gaussian process regression.
3. We give faster algorithms for approximating Schatten $p$-norms and other matrix norms. For example, for the Schatten 1-norm (nuclear norm), we give an algorithm that runs in $\tilde{O}(n^{2.11})$ time, improving on an $\tilde{O}(n^{2.18})$ method of [Musco et al., ITCS 2018].

All results are proven in the real RAM model of computation. Interestingly, previous state-of-the-art algorithms for most of the problems above relied on stochastic iterative methods, like stochastic coordinate and gradient descent. Our work takes a completely different approach, instead leveraging tools from matrix sketching.
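To make the basic building block concrete, the following is a minimal single-level sketch of a Nyström-preconditioned conjugate gradient solver for $(A + \lambda I)x = b$ with PSD $A$, together with a direct computation of the effective dimension $d_\lambda$. It is not the multi-level algorithm of the paper: it assumes a dense Gaussian sketch rather than a sparse one, applies the preconditioner inverse directly instead of through further sketching levels, and computes $d_\lambda$ by a dense solve purely for illustration. The function names `nystrom_approx`, `nystrom_pcg`, and `effective_dimension` are hypothetical and chosen only for this example.

```python
import numpy as np


def effective_dimension(A, reg):
    """d_lambda = tr(A (A + reg*I)^{-1}), computed by a dense solve (illustration only)."""
    n = A.shape[0]
    return float(np.trace(np.linalg.solve(A + reg * np.eye(n), A)))


def nystrom_approx(A, rank, rng):
    """Return (U, lam) with A ~= U @ diag(lam) @ U.T for a PSD matrix A.

    Assumption: a dense Gaussian test matrix is used here for simplicity.
    """
    n = A.shape[0]
    Omega, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    Y = A @ Omega
    # Small shift so that the Cholesky factorization below is numerically safe.
    shift = np.sqrt(n) * np.finfo(float).eps * np.linalg.norm(Y, 2)
    Y_shift = Y + shift * Omega
    C = np.linalg.cholesky(Omega.T @ Y_shift)   # C @ C.T = Omega^T (A + shift*I) Omega
    B = np.linalg.solve(C, Y_shift.T).T         # B = Y_shift @ C^{-T}
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    lam = np.maximum(s**2 - shift, 0.0)         # undo the shift
    return U, lam


def nystrom_pcg(A, b, reg, rank, tol=1e-8, max_iter=500, seed=0):
    """Solve (A + reg*I) x = b by CG with a rank-`rank` Nystrom preconditioner."""
    rng = np.random.default_rng(seed)
    U, lam = nystrom_approx(A, rank, rng)
    lam_min = lam[-1]  # smallest retained eigenvalue (SVD returns descending order)

    def apply_pinv(v):
        # P^{-1} v = (lam_min + reg) * U (Lam + reg*I)^{-1} U^T v + (I - U U^T) v
        Utv = U.T @ v
        return U @ (((lam_min + reg) / (lam + reg)) * Utv) + (v - U @ Utv)

    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_pinv(r)
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        Ap = A @ p + reg * p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter
```

In this sketch, choosing `rank` on the order of the effective dimension is the natural setting, since the preconditioner then captures the large eigenvalues and CG only has to deal with the remaining, well-conditioned tail.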
