The matrix-vector complexity of $Ax=b$

Michał Dereziński; Ethan N. Epperly; Raphael A. Meyer

TL;DR

This work establishes tight worst-case lower bounds on the number of matrix-vector products (matvecs) needed to solve a linear system, distinguishing two-sided access (products with both the matrix and its transpose) from one-sided access (products with the matrix only). For two-sided algorithms, it proves that at least $\frac{1-\eta}{4}\,\kappa \log(1/\varepsilon)$ matvecs are needed to obtain an $\varepsilon$-accurate solution for any fixed $\eta>0$, matching the performance of conjugate gradient on the normal equations up to a factor of about four. For one-sided, transpose-free algorithms, it proves an $\Omega(n)$ matvec lower bound (with precise constants) even when the input is perfectly conditioned, via a novel hidden Haar theorem and a reduction to bilinear power sequences. The results imply a complexity separation between matvec-based solvers and algorithms with full access to the matrix entries, clarifying why Krylov methods appear optimal in the worst case and giving a fine-grained picture of when speedups beyond matvecs are unlikely. The techniques connect the polynomial approximability of a function $f(x)$ on spectral domains to the hardness of the corresponding matrix-function problem, and introduce a new information-theoretic handle for transpose-free linear algebra, with implications for both theory and practical solver design.
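To make the two-sided matvec count concrete, here is a minimal Python sketch of conjugate gradient on the normal equations (the CGNR variant) that counts every product with $A$ and $A^\top$. It is an illustration under stated assumptions, not code from the paper: the names `CountingOperator`, `cgnr`, and `random_matrix_with_condition_number` are hypothetical, accuracy is measured by the relative residual, and the printed reference value $\kappa\log(2/\varepsilon)$ comes from the standard Chebyshev bound for conjugate gradient.

```python
import math
import numpy as np


class CountingOperator:
    """Two-sided access to a matrix: counts products with A and with A^T."""

    def __init__(self, A):
        self._A = np.asarray(A, dtype=float)
        self.matvecs = 0

    def matvec(self, v):
        self.matvecs += 1
        return self._A @ v

    def rmatvec(self, v):
        self.matvecs += 1
        return self._A.T @ v


def cgnr(op, b, tol=1e-8, max_iter=10_000):
    """Conjugate gradient on A^T A x = A^T b, started from x = 0.

    Stops once the recurrence residual satisfies ||b - A x|| <= tol * ||b||.
    """
    r = np.array(b, dtype=float)   # residual b - A x for x = 0
    z = op.rmatvec(r)              # gradient direction A^T r
    x = np.zeros_like(z)
    p = z.copy()
    zz = z @ z
    b_norm = np.linalg.norm(r)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            break
        w = op.matvec(p)           # one product with A
        alpha = zz / (w @ w)
        x += alpha * p
        r -= alpha * w
        z = op.rmatvec(r)          # one product with A^T
        zz_new = z @ z
        p = z + (zz_new / zz) * p
        zz = zz_new
    return x


def random_matrix_with_condition_number(n, kappa, seed=0):
    """Hypothetical test matrix: singular values spread evenly over [1, kappa]."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return U @ np.diag(np.linspace(1.0, kappa, n)) @ V.T


if __name__ == "__main__":
    n, kappa, eps = 200, 20.0, 1e-6
    A = random_matrix_with_condition_number(n, kappa)
    b = np.ones(n)
    op = CountingOperator(A)
    x = cgnr(op, b, tol=eps)
    print("matvecs used:", op.matvecs)
    print("kappa * log(2/eps):", kappa * math.log(2.0 / eps))
    print("relative residual:", np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```

Each iteration costs exactly two matvecs, one with $A$ and one with $A^\top$, which is why the two-sided setting is the natural one for this method.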

Abstract

Matrix-vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix-vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for a matrix-vector algorithm which can perform products with both a matrix and its transpose, $\Omega(\kappa\log(1/\varepsilon))$ matrix-vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix-vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix-vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
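For the one-sided result, the matching upper bound is elementary: $n$ products with the standard basis vectors reveal every column of the matrix, after which the system can be solved directly. The Python sketch below illustrates this; the function name `solve_with_n_one_sided_matvecs` is hypothetical and the code is not taken from the paper.

```python
import numpy as np


def solve_with_n_one_sided_matvecs(matvec, b):
    """Solve A x = b using exactly n products v -> A v and no transpose.

    `matvec` is the only access to A. Applying it to the j-th standard
    basis vector returns column j of A, so n products recover A entirely.
    """
    b = np.asarray(b, dtype=float)
    n = b.shape[0]
    A = np.empty((n, n))
    for j in range(n):
        e_j = np.zeros(n)
        e_j[j] = 1.0
        A[:, j] = matvec(e_j)      # column j of A
    return np.linalg.solve(A, b)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    # A random orthogonal matrix is perfectly conditioned (condition number 1).
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    b = rng.standard_normal(n)
    count = 0

    def matvec(v):
        global count
        count += 1
        return Q @ v

    x = solve_with_n_one_sided_matvecs(matvec, b)
    print("matvecs used:", count)
    print("residual norm:", np.linalg.norm(Q @ x - b))
```

With transpose access the same orthogonal system needs a single product, since $x = Q^\top b$; the lower bound says that without the transpose nothing substantially better than recovering the whole matrix is possible in the worst case.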

Paper Structure (40 sections, 26 theorems, 115 equations)

Introduction
Background and research questions
Our results
Our Techniques.
Implication: Fine-grained lower bounds and complexity separation
Related work
Upper and lower bounds for Krylov methods.
Transpose-free linear algebra.
Quantum linear systems.
Notation
Lower bounds for two-sided algorithms
Matrix-vector complexity and polynomial approximability
Spectral sums are easier than matrix-function-times-vector
Lower bound for estimating a spectral sum
Lower bounds for trace-inverse
...and 25 more sections

Key Result

Theorem 1.1

Fix $\eta>0$. There does not exist any algorithm that takes inputs $\kappa$, $\varepsilon$, and $\bm{b}$, which computes fewer than $\frac{1-\eta}{4}\kappa\log(1/\varepsilon)$ two-sided matrix-vector products with $\bm{A}$, and which returns a vector $\tilde{\bm{x}}$ such that for all numbers $\varepsilon>0$ and $\kappa \geq 1$ and matrices $\bm{A}$ of condition number ...
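
As a rough check of the "factor of four" comparison with conjugate gradient on the normal equations, the following worked estimate may help. It assumes accuracy is measured by the relative residual, which is an assumption here because the statement above is cut off before the error measure appears. Started from $\bm{x}_0 = \bm{0}$, the standard Chebyshev bound for conjugate gradient applied to $\bm{A}^\top\bm{A}$ (condition number $\kappa^2$) gives

$$\|\bm{b} - \bm{A}\bm{x}_t\| \le 2\left(\frac{\kappa-1}{\kappa+1}\right)^{t} \|\bm{b}\|, \qquad \log\frac{\kappa+1}{\kappa-1} \ge \frac{2}{\kappa},$$

so $t \ge \frac{\kappa}{2}\log(2/\varepsilon)$ iterations suffice for relative residual $\varepsilon$. Each iteration uses one product with $\bm{A}$ and one with $\bm{A}^\top$, for a total of about

$$2t \approx \kappa\log(2/\varepsilon) \quad\text{versus the lower bound}\quad \frac{1-\eta}{4}\,\kappa\log(1/\varepsilon),$$

and the ratio of the two tends to $4/(1-\eta)$ as $\varepsilon \to 0$.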

Theorems & Definitions (48)

Theorem 1.1: Linear systems: Lower bound against two-sided algorithms
Theorem 1.2: Linear systems: Lower bound against one-sided algorithms
Theorem 1.3: Linear systems: Fine-grained lower bound
Proposition 2.1: Matrix-function-vector product: Upper bound
Proposition 2.2: From matrix-function-times-vector to spectral sum
Definition 2.3: Inapproximable function
Theorem 2.4: Spectral sum: Lower bound (special case of the generic black-box matvec lower bound)
Lemma 2.5: Inapproximability of $1/x$ on split interval
Theorem 2.6: Trace-inverse: Lower bound
proof
...and 38 more
