Analysis of singular subspaces under random perturbations

Ke Wang

TL;DR

This work analyzes perturbations of singular vectors and subspaces for a low-rank matrix under Gaussian noise, extending classical Davis–Kahan–Wedin theory to all normalized unitarily invariant norms through an isotropic local law. It introduces a stochastic $\sin\Theta$ framework and sharp entrywise and $\ell_{2,\infty}$ bounds with improved rank dependence and relaxed gap conditions, applicable to both unitarily invariant norms and weighted norms. The results yield concise, provable performance guarantees for spectral algorithms in Gaussian mixture models and submatrix localization, highlighting practical impact for clustering and detection problems in high dimensions. The methodology accommodates sub-Gaussian extensions and broad norm classes, broadening the toolkit for non-asymptotic perturbation analysis in random matrices.

Abstract

We present a comprehensive analysis of singular vector and singular subspace perturbations in the signal-plus-noise matrix model with random Gaussian noise. Assuming a low-rank signal matrix, we extend the Davis-Kahan-Wedin theorem in a fully generalized manner, applicable to any unitarily invariant matrix norm, building on previous results by O'Rourke, Vu, and the author. Our analysis provides fine-grained insights, including $\ell_\infty$ bounds for singular vectors, $\ell_{2, \infty}$ bounds for singular subspaces, and results for linear and bilinear functions of singular vectors. Additionally, we derive $\ell_{2,\infty}$ bounds on perturbed singular vectors, taking into account the weighting by their corresponding singular values. Finally, we explore practical implications of these results in the Gaussian mixture model and the submatrix localization problem.
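
To make the quantities in the abstract concrete, here is a minimal NumPy sketch (not the paper's code). It builds a rank-$r$ signal $A$, adds i.i.d. standard Gaussian noise $E$, and compares the leading $k$-dimensional left singular subspaces of $A$ and $\widetilde{A} = A + E$. The distance is reported as three unitarily invariant norms of $\sin\Theta$ and as an $\ell_{2,\infty}$ error after orthogonal alignment. The sizes `n`, `r`, `k`, the singular values, the helper names, and the use of a Procrustes rotation for the alignment are all illustrative assumptions.

```python
import numpy as np


def random_orthonormal(n, r, rng):
    """Return an n x r matrix with orthonormal columns (QR of a Gaussian matrix)."""
    q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return q


def sin_theta(U, U_tilde):
    """Sines of the principal angles between the column spaces of U and U_tilde."""
    # The cosines of the principal angles are the singular values of U^T U_tilde.
    cosines = np.linalg.svd(U.T @ U_tilde, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off above 1
    return np.sqrt(1.0 - cosines**2)


def two_to_infinity(M):
    """The l_{2,infinity} norm: the largest Euclidean norm of a row of M."""
    return np.max(np.linalg.norm(M, axis=1))


rng = np.random.default_rng(0)
n, r, k = 200, 5, 2                                  # assumed sizes
sigma = np.array([100.0, 90.0, 60.0, 55.0, 50.0])    # assumed singular values

U = random_orthonormal(n, r, rng)
V = random_orthonormal(n, r, rng)
A = U @ np.diag(sigma) @ V.T          # low-rank signal
E = rng.standard_normal((n, n))       # i.i.d. standard Gaussian noise
A_tilde = A + E                       # signal-plus-noise model

U_tilde, _, _ = np.linalg.svd(A_tilde)
U_k, U_tilde_k = U[:, :k], U_tilde[:, :k]

# Unitarily invariant norms of sin(Theta) depend only on the sines themselves.
s = sin_theta(U_k, U_tilde_k)
op_norm = s.max()                     # operator norm
fro_norm = np.sqrt(np.sum(s**2))      # Frobenius norm
nuc_norm = s.sum()                    # nuclear norm

# Orthogonal Procrustes alignment: O minimizes ||U_tilde_k O - U_k||_F.
W1, _, W2t = np.linalg.svd(U_tilde_k.T @ U_k)
O = W1 @ W2t
row_err = two_to_infinity(U_tilde_k @ O - U_k)

print(f"sin-Theta: op={op_norm:.4f}  fro={fro_norm:.4f}  nuc={nuc_norm:.4f}")
print(f"l_(2,inf) error after alignment: {row_err:.4f}")
```

The three $\sin\Theta$ norms are ordered (operator ≤ Frobenius ≤ nuclear) for any pair of subspaces, so a bound stated for a general unitarily invariant norm covers all three at once.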

Paper Structure

This paper contains 45 sections, 37 theorems, 497 equations, 9 figures, 2 algorithms.

Key Result

Theorem 1

Let $\widetilde{A} = A + E$. Then for any unitarily invariant norm $\vvvert \cdot \vvvert$,

Figures (9)

  • Figure 1: CDF plots of $\sin \angle ( u_1, \widetilde{u}_1)$ across 300 trials. The signal matrix $A\in \mathbb{R}^{n\times n}$ has rank $r$. The noise matrix $E$ has i.i.d. standard Gaussian entries. We set the largest singular value of $A$ to $\sigma_1=100$. Left: We set $n=400$, $\delta_1=20$, and $\sigma_2=\ldots=\sigma_r=80$, varying $r=5, 20, 60, 100, 200$. Right: We fix $r=20$, $\delta_1=40$, and $\sigma_2=\ldots=\sigma_r=60$, and vary the dimension $n=400, 800, 1200$.
  • Figure 2: Empirical exploration of rank dependence in singular vector perturbation. For each rank $r \in \{50, 55, 60, \ldots, 600\}$, we simulate 100 independent trials of a rank-$r$ signal matrix $A$ corrupted by Gaussian noise and compute the 10th percentile of $\sin \angle(u_1, \widetilde{u}_1)$. We set $n=800$. The singular values of $A$ are given as follows: $\sigma_1 = 150$, $\sigma_2 = \sigma_1 - 20= 130$ with $\delta_1=20$, and the remaining $r-2$ nonzero singular values decay linearly down to 120. The figure compares three scaling laws, $\sqrt{r}$, $r^{1/3}$, and $\log(r)$, via linear fits and $R^2$ values, showing that $\sqrt{r}$ provides the best empirical fit ($R^2 = 0.9986$).
  • Figure 3: Empirical exploration of rank dependence in singular vector perturbation: $\ell_{\infty}$ norm. For each rank $r \in \{10, 30, \ldots, 350\}$, we simulate 100 independent trials of a rank-$r$ signal matrix $A$ corrupted by Gaussian noise and compute the 10th percentile of $\min_{\mathsf{s}\in\{\pm 1\}} \|u_1 - \mathsf{s} \widetilde{u}_1\|_\infty$. We set $n = 400$. The singular values of $A$ are given as follows: $\sigma_1 = 1000$, $\sigma_2 = 980$ with $\delta_1=20$, and the remaining $r-2$ nonzero singular values decay linearly down to 950. The figure compares three scaling laws, $r$, $\sqrt{r}$, and $r^{1/3}$, via linear fits and $R^2$ values, showing that $\sqrt{r}$ provides the best empirical fit ($R^2 = 0.9965$).
  • Figure 4: Empirical exploration of rank dependence in singular subspace perturbation: Frobenius norm. For each rank $r \in \{20, 40, 60,\ldots, 350\}$, we simulate 100 independent trials of a rank-$r$ signal matrix $A$ corrupted by Gaussian noise and compute the 10th percentile of $\|\sin \angle(U_{10}, \widetilde{U}_{10})\|_F$. We set $n = 400$. The singular values of $A$ are given as follows: $\sigma_1 = \cdots = \sigma_{10} = 1000$, and the remaining $r-10$ nonzero singular values decay linearly down to 950. The figure compares three scaling laws, $\sqrt{r + \log(2n)}$, $\sqrt{r}$, and $r$, via linear fits and $R^2$ values, showing that $\sqrt{r}$ provides the best empirical fit ($R^2 = 0.9528$).
  • Figure 5: Empirical exploration of rank dependence in singular subspace perturbation: Frobenius norm. For each rank $r \in \{20, 40, 60,\ldots, 350\}$, we simulate 100 independent trials of a rank-$r$ signal matrix $A$ corrupted by Gaussian noise and compute the 10th percentile of $\|\sin \angle(U_{50}, \widetilde{U}_{50})\|_F$. We set $n = 400$. The singular values of $A$ are given as follows: $\sigma_1 = \cdots = \sigma_{50} = 1000$, and the remaining $r-50$ nonzero singular values decay linearly down to 950. The figure compares three scaling laws, $\sqrt{r + \log(2n)}$, $\sqrt{r}$, and $r$, via linear fits and $R^2$ values, showing that $\sqrt{r}$ provides the best empirical fit ($R^2 = 0.9696$).
  • ...and 4 more figures
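
The protocol described in the captions of Figures 2 and 3 can be sketched as follows. This is a scaled-down illustration, not the authors' experiment code. It uses a smaller dimension (`n = 200`), only three ranks, and 20 trials per rank so that it runs in about a second. Two modelling choices are assumptions: the signal matrix is held fixed across trials with only the noise resampled, and the tail of the spectrum is linearly spaced from $\sigma_2$ down to the floor value. The function names `spectrum`, `signal_matrix`, and `leading_vector_errors` are hypothetical.

```python
import numpy as np


def spectrum(r, sigma1=150.0, delta1=20.0, floor=120.0):
    """sigma_1, then sigma_2 = sigma_1 - delta_1 decaying linearly to `floor`.

    The exact spacing of the tail is an assumption; the captions only say
    that the remaining singular values "decay linearly".
    """
    tail = np.linspace(sigma1 - delta1, floor, r - 1)  # sigma_2, ..., sigma_r
    return np.concatenate(([sigma1], tail))


def signal_matrix(n, sigma, rng):
    """Rank-r matrix with the given singular values and random singular vectors."""
    r = len(sigma)
    U, _ = np.linalg.qr(rng.standard_normal((n, r)))
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return U, (U * sigma) @ V.T  # (U * sigma) scales column j of U by sigma_j


def leading_vector_errors(n, sigma, trials, rng):
    """Per-trial sin-angle and sign-aligned sup-norm error for u_1."""
    U, A = signal_matrix(n, sigma, rng)
    u1 = U[:, 0]
    sines, sup_errs = [], []
    for _ in range(trials):
        E = rng.standard_normal((n, n))  # i.i.d. standard Gaussian noise
        U_tilde, _, _ = np.linalg.svd(A + E)
        u1_tilde = U_tilde[:, 0]
        c = min(abs(u1 @ u1_tilde), 1.0)
        sines.append(np.sqrt(1.0 - c**2))
        # min over the sign s in {+1, -1} of ||u_1 - s * u1_tilde||_inf
        sup_errs.append(min(np.max(np.abs(u1 - u1_tilde)),
                            np.max(np.abs(u1 + u1_tilde))))
    return np.array(sines), np.array(sup_errs)


rng = np.random.default_rng(0)
n, trials = 200, 20          # scaled down from the captions' n and 100 trials
ranks = [5, 20, 40]          # a few illustrative ranks
results = {}
for r in ranks:
    sines, sup_errs = leading_vector_errors(n, spectrum(r), trials, rng)
    results[r] = (np.percentile(sines, 10), np.percentile(sup_errs, 10))
    print(f"r={r:3d}  p10 sin-angle={results[r][0]:.4f}  "
          f"p10 sup-norm error={results[r][1]:.4f}")
```

To compare scaling laws as the captions do, one would run this over a finer grid of ranks and fit the percentile curve against candidates such as $\sqrt{r}$ by least squares. The sketch stops short of that step.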

Theorems & Definitions (44)

  • Theorem 1: Mirsky
  • Theorem 2: Wedin
  • Theorem 3: Unitarily invariant norms: simplified asymptotic version
  • Corollary 4
  • Theorem 5: Unitarily invariant norms: Gaussian noise
  • Remark 6
  • Remark 7
  • Theorem 8: Singular value bounds: general noise
  • Theorem 9: Singular subspace bounds: general noise
  • Theorem 10: $\ell_\infty$ and $\ell_{2, \infty}$ bounds: simplified asymptotic version
  • ...and 34 more
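
The first entry in the list above is attributed to Mirsky. Assuming it refers to the classical Mirsky inequality for singular values, $\vvvert \operatorname{diag}(\widetilde{\sigma}_i - \sigma_i) \vvvert \le \vvvert E \vvvert$ for every unitarily invariant norm, the following sketch checks that inequality numerically for the operator, Frobenius, and nuclear norms. The matrix sizes and singular values are arbitrary illustrative choices, and the check is a sanity test on one random draw, not a proof.

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 120, 4                                   # assumed sizes
sigma = np.array([80.0, 60.0, 45.0, 40.0])      # assumed singular values

U, _ = np.linalg.qr(rng.standard_normal((n, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = (U * sigma) @ V.T                           # rank-r signal
E = rng.standard_normal((n, n))                 # Gaussian noise

# Singular values are returned in non-increasing order, so they pair up by index.
s_A = np.linalg.svd(A, compute_uv=False)
s_A_tilde = np.linalg.svd(A + E, compute_uv=False)
s_E = np.linalg.svd(E, compute_uv=False)
diff = np.abs(s_A_tilde - s_A)

# A unitarily invariant norm is a symmetric gauge function of singular values.
gauges = {
    "operator": lambda s: s.max(),
    "frobenius": lambda s: np.sqrt(np.sum(s**2)),
    "nuclear": lambda s: s.sum(),
}

mirsky_check = {}
for name, gauge in gauges.items():
    lhs, rhs = gauge(diff), gauge(s_E)
    mirsky_check[name] = (lhs, rhs)
    print(f"{name:9s}  |||diag(diff)||| = {lhs:9.3f}  <=  |||E||| = {rhs:9.3f}")
```

In the operator norm this reduces to Weyl's inequality for singular values. For a low-rank signal the left-hand side is typically far below the right-hand side, which is the kind of slack that noise-aware bounds aim to exploit.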