The Algorithmic Phase Transition in Correlated Spiked Models

Zhangsong Li

TL;DR

The paper tackles the problem of detecting and recovering correlated signals in paired spiked matrices (Wigner and Wishart). It introduces a counting-based algorithm using edge-decorated cycles that achieves strong detection and weak recovery whenever F(λ,μ,ρ,γ) > 1, and it proves a matching lower bound for low-degree polynomial algorithms when F < 1, indicating a sharp computational phase transition at F = 1. By approximating the cycle-counting statistics through color-coding, the authors obtain polynomial-time algorithms with provable guarantees, and they contrast their thresholds with the classical PLS and CCA benchmarks in multi-dataset inference. The work thus establishes a concrete statistical-computational gap for correlated spiked models and offers a general methodology for leveraging inter-dataset correlations to go beyond the limits that apply when each dataset is analyzed separately.

Abstract

We study the computational task of detecting and estimating correlated signals in a pair of spiked matrices $$ X=\tfrac{λ}{\sqrt{n}} xu^{\top}+W, \quad Y=\tfrac{μ}{\sqrt{n}} yv^{\top}+Z $$ where the spikes $x,y$ have correlation $ρ$. Specifically, we consider two fundamental models: (1) the correlated spiked Wigner model with signal-to-noise ratios $λ,μ$; (2) the correlated spiked $n \times N$ Wishart (covariance) model with signal-to-noise ratios $\sqrt{λ},\sqrt{μ}$. We propose an efficient detection and estimation algorithm based on counting a specific family of edge-decorated cycles. The algorithm's performance is governed by the function $$ F(λ,μ,ρ,γ)=\max\Big\{ \frac{ λ^2 }{ γ}, \frac{ μ^2 }{ γ}, \frac{ λ^2 ρ^2 }{ γ-λ^2+λ^2 ρ^2 } + \frac{ μ^2 ρ^2 }{ γ-μ^2+μ^2 ρ^2 } \Big\} \,. $$ We prove that our algorithm succeeds for the correlated spiked Wigner model whenever $F(λ,μ,ρ,1)>1$, and succeeds for the correlated spiked Wishart model whenever $F(λ,μ,ρ,\tfrac{n}{N})>1$. Our result shows that an algorithm can leverage the correlation between the spikes to detect and estimate the signals even in regimes where efficiently recovering either $x$ from ${X}$ alone or $y$ from ${Y}$ alone is believed to be computationally infeasible. We complement our algorithmic results with evidence for a matching computational lower bound. In particular, we prove that when $F(λ,μ,ρ,1)<1$ for the correlated spiked Wigner model and when $F(λ,μ,ρ,\tfrac{n}{N})<1$ for the correlated spiked Wishart model, all algorithms based on low-degree polynomials fail to distinguish $({X},{Y})$ from two independent noise matrices. This strongly suggests that $F=1$ is the precise computational threshold for our models.
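
To make the threshold condition concrete, the following is a minimal Python sketch that evaluates $F(λ,μ,ρ,γ)$ exactly as written in the abstract. The function name `threshold_F` and the handling of non-positive denominators are assumptions of this sketch, not part of the paper. The guard relies on the observation that a denominator $γ-λ^2+λ^2ρ^2 \le 0$ forces $λ^2/γ \ge 1$, so dropping the third term there does not turn a value of at least 1 into a value below 1.

```python
def threshold_F(lam: float, mu: float, rho: float, gamma: float) -> float:
    """Evaluate F(lam, mu, rho, gamma) from the abstract.

    gamma = 1 corresponds to the correlated spiked Wigner model and
    gamma = n / N to the correlated spiked Wishart model.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")

    # Single-matrix terms: lam^2 / gamma and mu^2 / gamma.
    single_x = lam**2 / gamma
    single_y = mu**2 / gamma

    # Denominators of the two fractions in the correlated term.
    den_x = gamma - lam**2 + lam**2 * rho**2
    den_y = gamma - mu**2 + mu**2 * rho**2

    candidates = [single_x, single_y]
    # Assumption of this sketch: only use the correlated term when both
    # denominators are positive; otherwise a single-matrix term is >= 1.
    if den_x > 0 and den_y > 0:
        joint = lam**2 * rho**2 / den_x + mu**2 * rho**2 / den_y
        candidates.append(joint)
    return max(candidates)


if __name__ == "__main__":
    # Wigner case (gamma = 1): each spike alone is below its own threshold
    # (lam^2 = mu^2 = 0.64 < 1), yet strong correlation pushes F above 1.
    print(threshold_F(0.8, 0.8, 0.9, 1.0) > 1)
    # With weaker correlation the same signal strengths give F below 1.
    print(threshold_F(0.8, 0.8, 0.5, 1.0) < 1)
```

In the first call the maximum is attained by the third (correlated) term, which illustrates the regime described above where the pair $(X,Y)$ is informative even though neither matrix is informative on its own.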

Paper Structure

This paper contains 59 sections, 63 theorems, 353 equations, 5 figures.

Key Result

Theorem 1

Suppose that $n=\gamma N$ for some $\gamma=\Theta(1)$ and that $\bm Y$ is defined as in the definition of the spiked Wishart (covariance) model. When $\lambda^2 \leq \gamma$, the top eigenvalue $\varsigma_1(\frac{1}{N}\bm Y\bm Y^{\top})$ remains within the bulk of the noise spectrum, concentrating at $(1+\sqrt{\gamma})^2$ (match

Figures (5)

  • Figure 1: Phase diagram in the $(\lambda,\mu)$ plane illustrating the thresholds for the subgraph counts method (blue, this work), the PLS method (orange, MZ25+), and the CCA method (green, BHPZ19, MY23, BG23+). Here we take $\gamma=0.25$ and $\rho=0.99$.
  • Figure 2: An unlabeled graph $[H] \in \mathcal{H}(\ell)$ with $\ell=8$. Here $\mathsf{diff}(H)=\{ v_1,v_3,v_5,v_8 \}$.
  • Figure 3: An unlabeled graph $[H] \in \mathcal{G}(\ell)$ with $\ell=4$. Here $\mathsf{diff}(H)=\{ v_2,v_4 \}$.
  • Figure 4: An unlabeled graph $[H] \in \mathcal{J}(\ell)$ with $\ell=6$. Here $\mathsf{diff}(H)=\{ v_2,v_3,v_4,v_6 \}$.
  • Figure 5: An unlabeled graph $[H] \in \mathcal{I}(\ell)$ with $\ell=4$. Here $\mathsf{diff}(H)=\{ v_2,v_4 \}$.

Theorems & Definitions (101)

  • Definition 1.1: Spiked Wigner model
  • Definition 1.2: Spiked Wishart model
  • Theorem
  • Theorem
  • Definition 1.4: Correlated spikes distribution
  • Definition 1.5: Correlated spiked Wigner model
  • Definition 1.6: Correlated spiked Wishart model
  • Definition 1.7
  • Definition 1.8
  • Theorem 1.9: Informal
  • ...and 91 more