Sharp error bounds for approximate eigenvalues and singular values from subspace methods

Irina-Beatrice Haas, Yuji Nakatsukasa

TL;DR

The paper develops sharp quadratic error bounds for Ritz eigenvalues derived from subspace methods, showing that the error $|\lambda_i - \theta_i|$ scales like the square of the corresponding residual divided by a robust spectral gap, $|\lambda_i - \theta_i| \le c \|E_i\|_2^2 / \text{Gap}_i$, with $c \rightarrow 1$ as the residuals vanish. The approach exploits the structured perturbation inherent in Rayleigh-Ritz and extends to singular values via the Jordan-Wielandt theorem, yielding analogous bounds for SVD components. The results are adapted to well-separated Ritz values as well as clusters, and the asymptotic sharpness is established, demonstrating improvements over classical bounds in practical, large-scale computations. Numerical experiments with Krylov methods (e.g., Lanczos, LOBPCG) and randomized SVD (HMT) validate the bounds and show they are tight and computable from available residual and gap information, supporting their use as reliable error certificates in large-scale eigenvalue and singular value computations.
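The Jordan-Wielandt theorem mentioned above states that, for a real $m \times n$ matrix $A$ with $m \ge n$, the symmetric matrix $\begin{bmatrix} 0 & A \\ A^T & 0 \end{bmatrix}$ has eigenvalues $\pm\sigma_i(A)$ together with $m-n$ zeros. The following is a minimal numerical check of that fact with NumPy; the matrix sizes and the random test matrix are arbitrary choices for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                                   # illustrative sizes, m >= n
A = rng.standard_normal((m, n))

# Jordan-Wielandt matrix: symmetric, of size (m + n) x (m + n).
J = np.block([[np.zeros((m, m)), A],
              [A.T, np.zeros((n, n))]])

eigs = np.linalg.eigvalsh(J)                  # ascending order
sigma = np.linalg.svd(A, compute_uv=False)    # descending order

print("eigenvalues of J :", np.round(eigs, 6))
print("singular values  :", np.round(sigma, 6))
```

This is why an eigenvalue bound for symmetric matrices can be carried over to singular values: each $\sigma_i(A)$ appears as an eigenvalue of a symmetric matrix built from $A$.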

Abstract

Subspace methods are commonly used for finding approximate eigenvalues and singular values of large-scale matrices. Once a subspace is found, the Rayleigh-Ritz method (for symmetric eigenvalue problems) and Petrov-Galerkin projection (for singular values) are the de facto methods for extraction of eigenvalues and singular values. In this work we derive quadratic error bounds for approximate eigenvalues of symmetric matrices obtained via the Rayleigh-Ritz process. Our bounds take advantage of the fact that extremal eigenpairs tend to converge faster than the rest, hence having smaller residuals $\|A\widehat x_i-\theta_i\widehat x_i\|_2$, where $(\theta_i,\widehat x_i)$ is a Ritz pair (approximate eigenpair). The proof uses the structure of the perturbation matrix underlying the Rayleigh-Ritz method to bound the components of its eigenvectors. In this way, we obtain a bound of the form $c\frac{\|A\widehat x_i-\theta_i\widehat x_i\|_2^2}{\mbox{Gap}_i}$, where $\mbox{Gap}_i$ is roughly the gap between the $i$th Ritz value and the eigenvalues that are not approximated by the Ritz process, and $c>1$ is a modest scalar. Our bound is adapted to each Ritz value and is robust to clustered Ritz values, which is a key improvement over existing results. We further show that the bound is asymptotically sharp, and generalize it to singular values of arbitrary real matrices. Finally, we apply these bounds to several methods for computing eigenvalues and singular values, and illustrate the sharpness of our bounds in a number of computational settings, including Krylov methods and randomized algorithms.
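To make the quantities in the abstract concrete, the sketch below runs Rayleigh-Ritz on a synthetic symmetric matrix and prints, for each Ritz pair, the true error, the residual norm $\|A\widehat x_i-\theta_i\widehat x_i\|_2$, and the ratio residual$^2$/gap. Several things here are assumptions made for illustration and do not come from the paper: the helper name `rayleigh_ritz`, the trial subspace (exact eigenvectors plus a small random perturbation), the simplified gap (distance from $\theta_i$ to the exact eigenvalues outside the targeted set, which requires knowing the spectrum), and the omission of the constant $c$. It should not be read as the paper's precise bound.

```python
import numpy as np

def rayleigh_ritz(A, Q):
    """Rayleigh-Ritz for symmetric A on range(Q); Q must have orthonormal columns."""
    H = Q.T @ A @ Q
    theta, S = np.linalg.eigh((H + H.T) / 2)   # Ritz values, ascending
    X = Q @ S                                  # Ritz vectors (orthonormal columns)
    R = A @ X - X * theta                      # residuals A x_i - theta_i x_i
    return theta, X, np.linalg.norm(R, axis=0)

rng = np.random.default_rng(0)
n, k = 200, 12
lam = np.arange(1.0, n + 1)                    # exact eigenvalues 1, 2, ..., n
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * lam) @ U.T                            # A = U diag(lam) U^T
A = (A + A.T) / 2

# Hypothetical trial subspace: top-k eigenvectors plus a small random perturbation.
Q, _ = np.linalg.qr(U[:, -k:] + 1e-4 * rng.standard_normal((n, k)))

theta, X, res = rayleigh_ritz(A, Q)
err = np.abs(theta - lam[-k:])                 # error against the targeted eigenvalues

# Simplified gap (assumption): distance from theta_i to eigenvalues NOT targeted.
gap = np.min(np.abs(theta[:, None] - lam[None, :-k]), axis=1)
estimate = res**2 / gap                        # residual^2 / gap, constant c omitted

for t, e, r, g, b in zip(theta, err, res, gap, estimate):
    print(f"theta={t:9.4f}  error={e:.2e}  residual={r:.2e}  "
          f"gap={g:6.2f}  res^2/gap={b:.2e}")
```

In a real computation the exact eigenvalues are unknown, so the gap would have to be estimated from available information; the example uses the known spectrum only to keep the sketch short.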

Paper Structure

This paper contains 16 sections, 4 theorems, 60 equations, 6 figures.

Key Result

Lemma 1

Let $A_0$ and $F$ be symmetric matrices. Denote by $\lambda_i(t)$ the $i$th eigenvalue of $A_0+tF$ such that $(A_0+tF)x(t)=\lambda_i(t)x(t)$ where $\|x(t)\|_2=1$ for $t\in [0,1]$. If $\lambda_i(t)$ is simple, then
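The statement above is cut off before its conclusion in this extract. For orientation only, the classical first-order perturbation result for a simple eigenvalue in this setting is $\frac{d\lambda_i(t)}{dt} = x(t)^T F x(t)$; that the lemma concludes with exactly this identity is an assumption. The sketch below checks the classical identity against a central finite difference. The matrices, the index `i`, the point `t`, and the step `h` are arbitrary illustrative choices.

```python
import numpy as np

def eig_pair(A0, F, t, i):
    """Return the i-th eigenvalue (ascending) and unit eigenvector of A0 + t*F."""
    w, V = np.linalg.eigh(A0 + t * F)
    return w[i], V[:, i]

rng = np.random.default_rng(1)
n, i, t, h = 6, 2, 0.3, 1e-6
A0 = np.diag(np.arange(1.0, n + 1))        # simple, well-separated eigenvalues
G = rng.standard_normal((n, n))
F = 0.1 * (G + G.T) / 2                    # small symmetric perturbation direction

lam_t, x_t = eig_pair(A0, F, t, i)
analytic = x_t @ F @ x_t                   # x(t)^T F x(t)
numeric = (eig_pair(A0, F, t + h, i)[0]
           - eig_pair(A0, F, t - h, i)[0]) / (2 * h)

print(f"x^T F x           = {analytic:.8f}")
print(f"finite difference = {numeric:.8f}")
```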

Figures (6)

  • Figure 1: Error in Ritz values $|\theta_i-\lambda_i|$ and error bounds for uniformly distributed eigenvalues: $\lambda_i=i, \: \forall i\in [1,n]$.
  • Figure 2: Error in Ritz values $|\theta_i-\lambda_i|$ and error bounds for uniformly distributed eigenvalues ($\lambda_i=i$) and a cluster of 10 eigenvalues at $\lambda_0=20$ (left) and $\lambda_0=100$ (right).
  • Figure 3: Error $|\theta_i-\lambda_i|$ and bounds for approximate eigenvalues obtained with the Lanczos algorithm. The eigenvalues of $A$ that were approximated here are $\lambda_i \in [1,20]$. Some data points are above the bounds but only because of the limited machine precision used in the experiment.
  • Figure 4: Error $|\sigma_i-\theta_i|$ and bounds for geometrically distributed singular values where the trial subspaces were found with a single power iteration. Left: estimation with Petrov-Galerkin approximation; right: estimation with randomized SVD.
  • Figure 5: Error $|\sigma_i-\theta_i|$ and bounds for geometrically distributed singular values where the trial subspaces were found with double power iteration. Left: estimation with Petrov-Galerkin approximation; right: estimation with randomized SVD. In some cases the bounds lie below the actual error; this is due to round-off errors, whose effect was not accounted for in the perturbation matrix $\bar{A}$ (from \ref{eq:RR_svd_structure}) considered in our analysis. The operations (e.g. orthogonal multiplication) involved in both methods above are backward stable, so in finite precision arithmetic we can consider that $\sigma_i$ is actually the exact singular value of $\bar{A}+E_u$ with $\|E_u\|_2=\mathcal{O}(u\|A\|_2)$. Therefore, Weyl's theorem implies that the contribution of round-off errors in $|\sigma_i-\theta_i|$ is bounded by $\mathcal{O}(u\|A\|_2)$. That is, even in finite precision arithmetic, the bounds can be trusted up to working precision.
  • ...and 1 more figure
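The round-off argument in the caption of Figure 5 rests on Weyl's inequality for singular values, $|\sigma_i(A+E)-\sigma_i(A)| \le \|E\|_2$ for every $i$. The sketch below checks this inequality numerically. The test matrix and the relative perturbation level `1e-8` are assumptions chosen for illustration; the level is deliberately much larger than the unit roundoff $u$ so that the check is not itself dominated by floating-point error.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 60, 40
A = rng.standard_normal((m, n))

# Perturbation scaled so that ||E||_2 = 1e-8 * ||A||_2 (illustrative level).
E = rng.standard_normal((m, n))
E *= 1e-8 * np.linalg.norm(A, 2) / np.linalg.norm(E, 2)
norm_E = np.linalg.norm(E, 2)

s = np.linalg.svd(A, compute_uv=False)
s_pert = np.linalg.svd(A + E, compute_uv=False)
shift = np.max(np.abs(s - s_pert))          # largest change in any singular value

print(f"max |sigma_i(A+E) - sigma_i(A)| = {shift:.3e}")
print(f"||E||_2                         = {norm_E:.3e}")
```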

Theorems & Definitions (7)

  • Lemma 1
  • Theorem 1
  • Proof 1
  • Theorem 2
  • Proof 2
  • Theorem 3
  • Proof 3