
Coherence-free Entrywise Estimation of Eigenvectors in Low-rank Signal-plus-noise Matrix Models

Hao Yan, Keith Levin

Abstract

Spectral methods are widely used to estimate eigenvectors of a low-rank signal matrix subject to noise. These methods use the leading eigenspace of an observed matrix to estimate this low-rank signal. Typically, the entrywise estimation error of these methods depends on the coherence of the low-rank signal matrix with respect to the standard basis. In this work, we present a novel method for eigenvector estimation that avoids this dependence on coherence. Assuming a rank-one signal matrix, under mild technical conditions, the entrywise estimation error of our method provably has no dependence on the coherence under Gaussian noise (i.e., in the spiked Wigner model), and achieves the optimal estimation rate up to logarithmic factors. Simulations demonstrate that our method performs well under non-Gaussian noise and that an extension of our method to the case of a rank-$r$ signal matrix has little to no dependence on the coherence. In addition, we derive new metric entropy bounds for rank-$r$ singular subspaces under $\ell_{2,\infty}$ distance, which may be of independent interest. We use these new bounds to improve the best known lower bound for rank-$r$ eigenspace estimation under $\ell_{2,\infty}$ distance.
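As a rough illustration of the baseline setting the abstract describes, the sketch below simulates a rank-one spiked Wigner matrix and applies the plain spectral estimator (the leading eigenvector). It does not implement the paper's coherence-free method, which is not specified in this summary. Several pieces are assumptions made for illustration: the helper names (`spiked_wigner`, `leading_eigenvector`, `d_inf`, `coherence`) are hypothetical, the coherence of a unit vector is taken to be $\mu = n\|\boldsymbol{u}^\star\|_\infty^2$, and $d_\infty$ is taken to be the sign-aligned $\ell_\infty$ distance.

```python
import numpy as np


def spiked_wigner(u_star, lam, sigma, rng):
    """Return Y = lam * u u^T + W, where W is symmetric Gaussian noise."""
    n = u_star.shape[0]
    G = rng.normal(scale=sigma, size=(n, n))
    # Symmetrize: keep the upper triangle (with diagonal) and mirror it.
    W = np.triu(G) + np.triu(G, 1).T
    return lam * np.outer(u_star, u_star) + W


def leading_eigenvector(Y):
    """Baseline spectral estimate: eigenvector of the largest eigenvalue."""
    _, vecs = np.linalg.eigh(Y)  # eigenvalues are returned in ascending order
    return vecs[:, -1]


def d_inf(u_hat, u_star):
    """Sign-aligned entrywise distance (assumed form of d_infinity)."""
    return min(np.max(np.abs(u_hat - u_star)), np.max(np.abs(u_hat + u_star)))


def coherence(u):
    """Assumed rank-one coherence: n * ||u||_inf^2, which lies in [1, n]."""
    return u.shape[0] * np.max(np.abs(u)) ** 2


rng = np.random.default_rng(0)
n, sigma = 400, 1.0
lam = 5.0 * sigma * np.sqrt(n)  # signal strength well above sigma * sqrt(n)

# A high-coherence unit vector: one entry equal to 0.8, the rest spread evenly.
u_star = np.full(n, np.sqrt((1.0 - 0.8 ** 2) / (n - 1)))
u_star[0] = 0.8

Y = spiked_wigner(u_star, lam, sigma, rng)
u_hat = leading_eigenvector(Y)
print("coherence:", coherence(u_star))
print("entrywise error of spectral estimate:", d_inf(u_hat, u_star))
```

Rerunning this with different values of the largest entry of `u_star` gives a quick way to see how the spectral estimate's entrywise error changes with coherence, which is the dependence the paper's method is designed to avoid.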

Paper Structure

This paper contains 30 sections, 22 theorems, 335 equations, 6 figures, 2 algorithms.

Key Result

Lemma 1

Under the signal-plus-noise model (Equation eq:signal-plus-noise) with Gaussian noise, let $\boldsymbol{M}^\star = \lambda^\star \boldsymbol{u}^\star {\boldsymbol{u}^\star}^\top$. If the limits $\lim_{n \to \infty} \lambda^\star/(\sigma\sqrt{n}) > 1$ and $\lim_{n\to\infty} \mu/n$ both exist, then for any $\mu \in [1,n]$,

Figures (6)

  • Figure 1: Estimation error measured in $d_{\infty}$ as a function of dimension $n$, by the leading eigenvector (blue) and Algorithm \ref{alg:rank:one:simple} (orange), for $\|\boldsymbol{u}^\star\|_{\infty}$ equal to $0.8$, $0.55$ and $0.3$ (dotted, dashed and solid lines, respectively). We consider $\boldsymbol{u}^\star$ generated from the Bernoulli (top row) and Haar (bottom row) schemes, and we consider Gaussian (left), Rademacher (middle) and Laplacian (right) noise.
  • Figure 2: Estimation error under $d_{\infty}$ as a function of size $n$, by the $k$-th eigenvector (blue/purple) and the estimator in Algorithm \ref{alg:rank:r:simple} (orange/red) for $\|\boldsymbol{u}^\star\|_{\infty}$ equal to $0.8$ (dotted lines), $0.55$ (dashed lines) or $0.3$ (solid lines) with Gaussian (left), Rademacher (center) or Laplacian (right) noise.
  • Figure 3: Numerical error in recovering the largest entry of $\boldsymbol{u}^\star$ as a function of matrix dimension $n$, by the leading eigenvector (blue line) or the estimator given in Algorithm \ref{alg:rank:one:simple} (orange line) for three different choices of $\|\boldsymbol{u}^\star\|_{\infty}$: $0.8$ (dotted lines), $0.55$ (dashed lines) and $0.3$ (solid lines). The plot on the left corresponds to $\boldsymbol{u}^\star$ generated via the Bernoulli scheme, while the plot on the right corresponds to the Haar scheme.
  • Figure 4: Error as measured in $d_{2,\infty}$ as a function of matrix dimension $n$, by the spectral estimate (blue) and the estimator in Algorithm \ref{alg:rank:r:simple} (orange) for three different choices of $\|{\boldsymbol{U}^\star}\|_{\infty}$: $0.8$ (dotted lines), $0.55$ (dashed lines) and $0.3$ (solid lines). Columns correspond to $\boldsymbol{W}$ being Gaussian (left), Rademacher (center) and Laplacian (right). The rows correspond to the signal matrix having rank $2$ (top) and rank $3$ (bottom).
  • Figure 5: Error measured in $d_{\infty}$ as a function of matrix dimension $n$, for the three leading signal eigenvectors $\boldsymbol{u}^\star_k, k=1,2,3$ (line width) by the spectral estimate (blue/purple) or the estimator given in Algorithm \ref{alg:rank:r:simple} (orange/red) for three different choices of $\|\boldsymbol{u}^\star\|_{\infty}$: $0.8$ (dotted lines), $0.55$ (dashed lines) and $0.3$ (solid lines). The plots correspond to $\boldsymbol{W}$ being Gaussian (left), Rademacher (center) and Laplacian (right).
  • ...and 1 more figure

Theorems & Definitions (48)

  • Lemma 1
  • Lemma 2
  • Remark 1
  • Remark 2
  • Lemma 3
  • Remark 3
  • Theorem 1
  • Remark 4
  • Lemma 4
  • Remark 5
  • ...and 38 more