The matrix-vector complexity of $Ax=b$
Michał Dereziński, Ethan N. Epperly, Raphael A. Meyer
TL;DR
This work establishes worst-case lower bounds, tight up to small constant factors, on the number of matrix-vector products (matvecs) needed to solve a linear system, distinguishing two-sided from one-sided access to the matrix. For two-sided algorithms, it proves a lower bound of $\Omega\big(\frac{1-\eta}{4}\; \kappa \log(1/\varepsilon)\big)$ matvecs to obtain an $\varepsilon$-accurate solution, matching the performance of conjugate gradient on the normal equations up to a factor of about four. For one-sided, transpose-free algorithms, it proves an $\Omega(n)$ matvec lower bound (with precise constants) even when the input is perfectly conditioned, via a novel hidden Haar theorem and a reduction to bilinear power sequences. The results imply a fundamental complexity separation between matvec-based solvers and algorithms with full access to the matrix, clarifying why Krylov methods appear optimal in the worst case and giving a fine-grained picture of when speedups beyond matvecs are unlikely. The techniques connect polynomial approximability of $f(x)$ on spectral domains to matrix-function hardness and introduce a new information-theoretic handle for transpose-free linear algebra, with implications for both theory and practical solver design.
Abstract
Matrix-vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix-vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for a matrix-vector algorithm which can perform products with both a matrix and its transpose, $\Omega(\kappa\log(1/\varepsilon))$ matrix-vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix-vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix-vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
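To make the two-sided access model concrete, the following is a minimal sketch of conjugate gradient applied to the normal equations $A^\top A x = A^\top b$, written so that the solver touches the matrix only through products with $A$ and $A^\top$. It is an illustration under stated assumptions, not the paper's construction: the CGNR variant is one standard choice among several, and the names `TwoSidedOracle`, `cgnr`, and `dot`, the tolerance, and the small test matrix are all hypothetical.

```python
# Sketch: conjugate gradient on the normal equations A^T A x = A^T b (CGNR),
# written against a two-sided matvec oracle. Pure Python, no dependencies.
# All names here are illustrative assumptions, not taken from the paper.


class TwoSidedOracle:
    """Exposes only v -> A v and v -> A^T v, and counts every product."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]
        self.matvecs = 0

    def apply(self, v):
        self.matvecs += 1
        return [sum(a * x for a, x in zip(row, v)) for row in self._rows]

    def apply_transpose(self, v):
        self.matvecs += 1
        out = [0.0] * len(self._rows[0])
        for row, vi in zip(self._rows, v):
            for j, a in enumerate(row):
                out[j] += a * vi
        return out


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cgnr(oracle, b, tol=1e-10, max_iters=1000):
    """Approximately solve A x = b using only the oracle's two products."""
    r = list(b)                      # residual b - A x for the start x = 0
    z = oracle.apply_transpose(r)    # normal-equations residual A^T r
    x = [0.0] * len(z)
    p = list(z)                      # search direction
    zz = dot(z, z)
    b_norm = dot(b, b) ** 0.5
    for _ in range(max_iters):
        # Stop on small relative residual, or when A^T r vanishes exactly.
        if dot(r, r) ** 0.5 <= tol * b_norm or zz == 0.0:
            break
        w = oracle.apply(p)
        alpha = zz / dot(w, w)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = oracle.apply_transpose(r)
        zz_new = dot(z, z)
        beta = zz_new / zz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x


# Small nonsymmetric example; the exact solution is x = (0.1, 0.6).
oracle = TwoSidedOracle([[4.0, 1.0], [2.0, 3.0]])
x = cgnr(oracle, [1.0, 2.0])
print("solution:", x)
print("matvecs used:", oracle.matvecs)
```

Each iteration costs one product with $A$ and one with $A^\top$, so the counter in `oracle.matvecs` is the quantity that the lower bounds above constrain. A one-sided algorithm, in this sketch, would be one that is never allowed to call `apply_transpose`.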
