The matrix-vector complexity of $Ax=b$

Michał Dereziński, Ethan N. Epperly, Raphael A. Meyer

TL;DR

This work establishes worst-case lower bounds on the number of matrix-vector products (matvecs) needed to solve linear systems, distinguishing two-sided from one-sided access to the matrix. For two-sided algorithms, which can multiply by both the matrix and its transpose, it proves that at least $\frac{1-\eta}{4}\,\kappa \log(1/\varepsilon)$ matvecs are needed to obtain an $\varepsilon$-accurate solution for any fixed $\eta > 0$, matching the performance of conjugate gradient on the normal equations up to a factor of about four. For one-sided, transpose-free algorithms, it proves an $\Omega(n)$ matvec lower bound (with precise constants) even when the input is perfectly conditioned, via a novel hidden Haar theorem and a reduction to bilinear power sequences. The results imply a fundamental complexity separation between matvec-based solvers and algorithms with full access to the matrix entries, clarifying why Krylov methods are optimal in the worst case and giving a fine-grained picture of when matvec-based methods cannot be sped up further. The techniques connect the polynomial approximability of a function $f(x)$ on spectral domains to the hardness of the corresponding matrix-function problem and introduce a new information-theoretic tool for transpose-free linear algebra, with implications for both theory and practical solver design.
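The gap between two-sided and one-sided access can be illustrated on a perfectly conditioned instance. The sketch below assumes "perfectly conditioned" means $\kappa = 1$ and uses a Haar-distributed orthogonal matrix as the example; the helper `haar_orthogonal` and the problem size are illustrative choices, not taken from the paper. With transpose access, a single product $A^\top b$ solves the system exactly, whereas the one-sided result says that roughly $n$ products with $A$ alone are unavoidable in the worst case.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix from the Haar measure (hypothetical helper)."""
    G = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(G)
    # Rescale each column by the sign of the matching diagonal entry of R,
    # which makes the QR-based sample Haar distributed.
    return Q * np.sign(np.diag(R))

rng = np.random.default_rng(0)
n = 50  # illustrative problem size
A = haar_orthogonal(n, rng)
b = rng.standard_normal(n)

# Two-sided access: one product with A^T gives the exact solution,
# because A^T A = I for an orthogonal matrix.
x = A.T @ b

residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
cond = np.linalg.cond(A)  # ratio of largest to smallest singular value
print(f"relative residual: {residual:.2e}, condition number: {cond:.6f}")
```

The example only shows why the transpose is so powerful here; it does not reproduce the lower-bound argument itself.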

Abstract

Matrix-vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix-vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for a matrix-vector algorithm which can perform products with both a matrix and its transpose, $\Omega(\kappa\log(1/\varepsilon))$ matrix-vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix-vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix-vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
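For reference, the upper bound mentioned above comes from running conjugate gradient on the normal equations $A^\top A x = A^\top b$. The following is a minimal sketch of one standard variant (often called CGNR) that counts two-sided matvecs; the function name `cgnr`, the stopping rule, the square-matrix assumption, and the test problem are illustrative and not taken from the paper.

```python
import numpy as np

def cgnr(matvec, rmatvec, b, tol=1e-8, max_iter=1000):
    """CG on the normal equations A^T A x = A^T b, counting matvecs.

    matvec(v) returns A @ v and rmatvec(v) returns A.T @ v.
    Assumes A is square and nonsingular (an assumption of this sketch).
    """
    x = np.zeros_like(b)
    r = b.copy()            # residual b - A x for the starting guess x = 0
    z = rmatvec(r)          # normal-equations residual A^T r
    p = z.copy()            # search direction
    zz = z @ z
    matvecs = 1             # one product with A^T so far
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            break
        w = matvec(p)       # product with A
        alpha = zz / (w @ w)
        x += alpha * p
        r -= alpha * w
        z = rmatvec(r)      # product with A^T
        matvecs += 2
        zz_new = z @ z
        p = z + (zz_new / zz) * p
        zz = zz_new
    return x, matvecs

# Illustrative test problem with a prescribed condition number.
rng = np.random.default_rng(0)
n, kappa = 60, 10.0
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(np.linspace(1.0, kappa, n)) @ V.T
b = rng.standard_normal(n)

x, matvecs = cgnr(lambda v: A @ v, lambda v: A.T @ v, b)
rel_res = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
print(matvecs, rel_res)
```

Each iteration costs one product with $A$ and one with $A^\top$, so the matvec count is about twice the iteration count. This is the quantity the two-sided lower bound is compared against.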

Paper Structure (40 sections, 26 theorems, 115 equations)

Key Result

Theorem 1.1

Fix $\eta>0$. There does not exist any algorithm that takes inputs $\kappa$, $\varepsilon$, and $\bm{b}$, which computes fewer than $\frac{1-\eta}{4}\kappa\log(1/\varepsilon)$ two-sided matrix-vector products with $\bm{A}$, and which returns a vector $\tilde{\bm{x}}$ such that for all numbers $\varepsilon>0$ and $\kappa \geq 1$ and matrices $\bm{A}$ of condition number …
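To get a feel for the size of the threshold $\frac{1-\eta}{4}\kappa\log(1/\varepsilon)$, the short sketch below evaluates it numerically. The function name `two_sided_lower_bound` is hypothetical, the logarithm is assumed to be the natural logarithm, and the input checks ($0 < \varepsilon < 1$, $0 < \eta < 1$) are restrictions added here so that the value is positive; none of these choices come from the theorem statement.

```python
import math

def two_sided_lower_bound(kappa, eps, eta):
    """Evaluate (1 - eta)/4 * kappa * log(1/eps), assuming a natural logarithm."""
    if not (kappa >= 1 and 0 < eps < 1 and 0 < eta < 1):
        raise ValueError("expected kappa >= 1, 0 < eps < 1, 0 < eta < 1")
    return (1.0 - eta) / 4.0 * kappa * math.log(1.0 / eps)

# Example: condition number 100, accuracy 1e-6, eta = 0.1.
threshold = two_sided_lower_bound(100.0, 1e-6, 0.1)
print(f"matvec threshold: {threshold:.1f}")
```

The threshold grows linearly in $\kappa$ and only logarithmically in $1/\varepsilon$, which is the scaling the theorem shows cannot be improved for two-sided algorithms.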

Theorems & Definitions (48)

  • Theorem 1.1: Linear systems: Lower bound against two-sided algorithms
  • Theorem 1.2: Linear systems: Lower bound against one-sided algorithms
  • Theorem 1.3: Linear systems: Fine-grained lower bound
  • Proposition 2.1: Matrix-function-vector product: Upper bound
  • Proposition 2.2: From matrix-function-times-vector to spectral sum
  • Definition 2.3: Inapproximable function
  • Theorem 2.4: Spectral sum: Lower bound (special case of the generic black-box matvec lower bound theorem)
  • Lemma 2.5: Inapproximability of $1/x$ on split interval
  • Theorem 2.6: Trace-inverse: Lower bound
  • proof
  • ...and 38 more