BBP Phase Transition for a Doubly Sparse Deformed Model

Ioana Dumitriu; JD Flynn; Zhichao Wang

Abstract

We prove the equivalent of the Baik, Ben Arous, Péché (2004) phenomenon for a novel, doubly sparse model where both the Wigner noise matrix and signal vector(s) are sparse. Specifically, we consider a deformed sub-Gaussian sparse Wigner ensemble with a fixed number of sub-Gaussian spike vectors of the same-order sparsity added. We show that spike vectors with signals greater than one are correlated with the top eigenvectors of the deformed ensemble and that each spike vector of signal greater than one induces an outlier eigenvalue. Notably, our results hold in the supercritical sparsity regime for the Wigner matrix ($q \gg \frac{\log n}{n}$) and for any sparse spike vector with an unbounded number of entries ($np\to \infty$). No further relationship between the sparsities of the noise matrix ($q$) and spike vectors ($p$) is necessary. This generalizes the work of Benaych-Georges and Nadakuditi (2010) and Péché (2005).

Paper Structure (46 sections, 17 theorems, 129 equations, 4 figures)

Keywords: sparse PCA, sparse Wigner, deformed random matrix models, signal recovery

Introduction
Overview
Spiked Random Matrix Models.
BBP Phase Transition.
BBP Phenomenon with Altered Parameters.
Sparse PCA.
Doubly Sparse PCA.
Doubly Sparse Literature Review
Random Matrix Theory Tools.
Similar Sparse Models.
Application: Planted Clique Model.
Notation
Model Setup
The Spikes
...and 31 more sections

Key Result

Theorem 4

Given Assumption p_assumption, we have the following behavior for the extremal eigenvalues of $X$ in model for all fixed $1\le i\le r$: in probability. More specifically, for any $\theta_i>1$ and $\gamma>0$, there exist some positive constants $c_1, C_2, C_3, C_5, C_7, C_8$, and $D>0$ such that holds, where $\bm f_r(n)$ is defined by eq:rate_func and

Figures (4)

Figure 1: The dense spike ($p=1/2$) and sparse noise ($q=\frac{(\log n)^2}{n}$) case, when $r=1$ and $n=15000$. (a) The eigenvalue distribution of $X$ compared to the theoretical prediction of the limiting eigenvalue distribution and the location of the outlier eigenvalue. Here $\theta_1 = 3$. (b) The alignment of top eigenvectors with the spike vector $\langle u_1(X),v_1 \rangle^2$ for various signal-to-noise ratios $\theta_1$. Each experiment is run 30 times to obtain an average.
Figure 2: The sparse spike ($p=\frac{(\log n)^2}{n}$) and moderately dense noise ($q=1/2$) case, when $r=1$ and $n=15000$. (a) The eigenvalue distribution of $X$ compared to the theoretical prediction of the limiting eigenvalue distribution and the location of the outlier eigenvalue. Here $\theta_1 = 3$. (b) The alignment of top eigenvectors with the spike vector $\langle u_1(X),v_1 \rangle^2$ for various signal-to-noise ratios $\theta_1$. We take an average over 30 runs.
Figure 3: The sparse spike and sparse noise ($p=q=\frac{(\log n)^2}{n}$) case, when $r=1$ and $n=15000$. (a) The eigenvalue distribution of $X$ compared to the theoretical prediction of the limiting eigenvalue distribution and the location of the outlier eigenvalue. Here $\theta_1 = 3$. (b) The alignment of top eigenvectors with the spike vector $\langle u_1(X),v_1 \rangle^2$ for various signal-to-noise ratios $\theta_1$. We take an average over 30 runs.
Figure 4: The sparse spike and sparse noise ($p=\frac{3(\log n)^2}{2n}, q=\frac{5(\log n)^3}{n}$) case, when $r=2$ and $n=20000$. (a) The eigenvalue distribution of $X$ compared to the theoretical prediction of the limiting eigenvalue distribution and the locations of the outlier eigenvalues. Here $\theta_1 = 5, \theta_2=4$. (b) The entries of the top three unit-norm eigenvectors $u_1(X),u_2(X),u_3(X)$ of the matrix in (a). The first two eigenvectors, which correspond to the spikes, are localized and sparse, consistent with Theorem 7 applied to sparse signals, while the third eigenvector, which comes from the bulk, is delocalized.
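
The experiments described in the figure captions can be approximated with a short simulation. The sketch below is illustrative only and is not the authors' code. It assumes Bernoulli-Gaussian entries for both the noise and the spike, a noise matrix scaled so that each entry has variance $1/n$, and spike vectors rescaled to unit norm. The function names `sample_doubly_sparse` and `top_eigen_summary` are hypothetical. The reference values $\theta + 1/\theta$ for the outlier and $1 - 1/\theta^2$ for the squared alignment are the classical dense BBP predictions, used here only as a point of comparison.

```python
import numpy as np


def sample_doubly_sparse(n, p, q, thetas, rng):
    """Sample X = W + sum_i theta_i v_i v_i^T (a sketch under stated assumptions).

    Assumptions (not taken from the paper): W has Bernoulli(q) * N(0, 1)
    entries scaled by 1/sqrt(n q); each spike has Bernoulli(p) * N(0, 1)
    entries and is rescaled to unit norm.
    """
    mask = rng.random((n, n)) < q
    gauss = rng.standard_normal((n, n))
    upper = np.triu(mask * gauss)              # keep diagonal and upper triangle
    W = (upper + np.triu(upper, 1).T) / np.sqrt(n * q)  # symmetrize, variance 1/n

    X = W.copy()
    spikes = []
    for theta in thetas:
        v = (rng.random(n) < p) * rng.standard_normal(n)  # sparse spike
        v = v / np.linalg.norm(v)              # unit norm (assumes np is large)
        X += theta * np.outer(v, v)
        spikes.append(v)
    return X, np.array(spikes)


def top_eigen_summary(X, v):
    """Return the largest eigenvalue of X and the squared overlap <u_1(X), v>^2."""
    vals, vecs = np.linalg.eigh(X)             # eigenvalues in ascending order
    u1 = vecs[:, -1]
    return vals[-1], float(np.dot(u1, v) ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 1000                                   # much smaller than in the figures
    sparsity = np.log(n) ** 2 / n              # p = q = (log n)^2 / n, as in Figure 3
    theta = 3.0
    X, V = sample_doubly_sparse(n, sparsity, sparsity, [theta], rng)
    lam, align = top_eigen_summary(X, V[0])
    print("top eigenvalue:", lam, "classical prediction:", theta + 1 / theta)
    print("squared alignment:", align, "classical prediction:", 1 - 1 / theta**2)
```

At this small $n$ the finite-size fluctuations are visible, so the printed values should be read as rough agreement with the classical predictions rather than a reproduction of the figures.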

Theorems & Definitions (22)

Definition 1
Definition 2
Definition 3: Probability Bound Function
Theorem 4: Doubly Sparse BBP: Eigenvalues
Corollary 5: Distinguishability
Corollary 6
Theorem 7: Doubly Sparse BBP: Eigenvectors
Remark
Corollary 8: Weak Recovery
Lemma 9
...and 12 more
