Stability of the Lanczos Method for Matrix Function Approximation
Cameron Musco, Christopher Musco, Aaron Sidford
TL;DR
This paper proves that finite precision Lanczos essentially matches the exact arithmetic guarantee if computations use roughly $\log(nC\|A\|)$ bits of precision, and raises the question of if convergence in less than $poly(\kappa)$ iterations can be expected in finite precision, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
Abstract
The ubiquitous Lanczos method can approximate $f(A)x$ for any symmetric $n \times n$ matrix $A$, vector $x$, and function $f$. In exact arithmetic, the method's error after $k$ iterations is bounded by the error of the best degree-$k$ polynomial uniformly approximating $f(x)$ on the range $[λ_{min}(A), λ_{max}(A)]$. However, despite decades of work, it has been unclear if this powerful guarantee holds in finite precision. We resolve this problem, proving that when $\max_{x \in [λ_{min}, λ_{max}]}|f(x)| \le C$, Lanczos essentially matches the exact arithmetic guarantee if computations use roughly $\log(nC\|A\|)$ bits of precision. Our proof extends work of Druskin and Knizhnerman [DK91], leveraging the stability of the classic Chebyshev recurrence to bound the stability of any polynomial approximating $f(x)$. We also study the special case of $f(A) = A^{-1}$, where stronger guarantees hold. In exact arithmetic Lanczos performs as well as the best polynomial approximating $1/x$ at each of $A$'s eigenvalues, rather than on the full eigenvalue range. In seminal work, Greenbaum gives an approach to extending this bound to finite precision: she proves that finite precision Lanczos and the related CG method match any polynomial approximating $1/x$ in a tiny range around each eigenvalue [Gre89]. For $A^{-1}$, this bound appears stronger than ours. However, we exhibit matrices with condition number $κ$ where exact arithmetic Lanczos converges in $polylog(κ)$ iterations, but Greenbaum's bound predicts $Ω(κ^{1/5})$ iterations. It thus cannot offer significant improvement over the $O(κ^{1/2})$ bound achievable via our result. Our analysis raises the question of if convergence in less than $poly(κ)$ iterations can be expected in finite precision, even for matrices with clustered, skewed, or otherwise favorable eigenvalue distributions.
