Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods

Yihan Zhang, Marco Mondelli

Abstract

We study the matrix denoising problem of estimating the singular vectors of a rank-$1$ signal corrupted by noise with both column and row correlations. Existing works are either unable to pinpoint the exact asymptotic estimation error or, when they do so, the resulting approaches (e.g., based on whitening or singular value shrinkage) remain vastly suboptimal. On top of this, most of the literature has focused on the special case of estimating the left singular vector of the signal when the noise only possesses row correlation (one-sided heteroscedasticity). In contrast, our work establishes the information-theoretic and algorithmic limits of matrix denoising with doubly heteroscedastic noise. We characterize the exact asymptotic minimum mean square error, and design a novel spectral estimator with rigorous optimality guarantees: under a technical condition, it attains positive correlation with the signals whenever information-theoretically possible and, for one-sided heteroscedasticity, it also achieves the Bayes-optimal error. Numerical experiments demonstrate the significant advantage of our theoretically principled method over the state of the art. The proofs draw connections with statistical physics and approximate message passing, departing drastically from standard random matrix theory techniques.

Paper Structure (31 sections, 37 theorems, 336 equations, 4 figures)

Key Result

Proposition 4.1

The fixed point equation \ref{eqn:fp_it} always has a trivial solution $(0, 0)$. There exists a non-trivial solution $(q_u^*, q_v^*)\in{\mathbb R}_{>0}^2$ if and only if [...], in which case the non-trivial solution is unique.

Figures (4)

  • Figure 1: Top two singular values of $A^*$ in \ref{eqn:A*_main}, where $d=4000, \delta = 4$ and each simulation is averaged over $10$ i.i.d. trials. The singular values computed experimentally ('sim' in the legends and $\times$ in the plots) closely match our theoretical prediction in \ref{eqn:sing_char} ('thy' in the legends and solid curves with the same color in the plots). The threshold $\lambda^*$ is such that equality holds in \ref{eqn:thr_bayes}. We note that the green curve corresponding to $\sigma_2^*$ is smaller than $1$ for $\lambda>\lambda^*$, i.e., when \ref{eqn:thr_bayes} holds.
  • Figure 2: Performance comparison when $\Xi = I_n$ and $\Sigma$ is a circulant matrix. The numerical results closely follow the predictions of \ref{thm:spec}, and our spectral estimators in \ref{eqn:def_spec} outperform all other methods (Leeb--Romanov, OptShrink, ScreeNOT, and HeteroPCA), especially at low SNR.
  • Figure 3: Performance comparison when $\Xi$ is a Toeplitz matrix and $\Sigma$ is circulant. The numerical results closely follow the predictions of \ref{thm:spec}, and our spectral estimators in \ref{eqn:def_spec} outperform all other methods (Leeb, OptShrink, and ScreeNOT), especially at low SNR.
  • Figure 4: Spectra of $A$ and $A^*$ averaged over $10$ i.i.d. trials, where $d = 4000, \delta = 4$. An outlier singular value emerges in the spectrum of $A^*$ due to the pre-processing on $A$.
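
To make the one-sided setting of Figure 2 concrete ($\Xi = I_n$, $\Sigma$ circulant), here is a minimal simulation sketch. It is not the paper's spectral estimator: it only draws a rank-$1$ signal plus column-correlated Gaussian noise and measures how well the plain top singular vectors of $A$ align with the signal. The scaling $\lambda/n$, the noise variance $1/n$, the Gaussian signal vectors, the decay parameter `rho`, and the helper names `circulant`, `simulate` and `overlap` are all assumptions made for illustration and are not taken from this page.

```python
import numpy as np


def circulant(first_col):
    """Circulant matrix with entry (i, j) equal to first_col[(i - j) mod d]."""
    d = len(first_col)
    idx = (np.arange(d)[:, None] - np.arange(d)[None, :]) % d
    return first_col[idx]


def simulate(n, d, lam, rho=0.5, seed=0):
    """Draw A = (lam / n) u v^T + W C^T with Xi = I_n (assumed model scaling).

    The column covariance is Sigma = C C^T, which is circulant and positive
    semidefinite by construction, so no matrix square root is needed.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)  # assumed Gaussian left signal
    v = rng.standard_normal(d)  # assumed Gaussian right signal

    # Symmetric circulant factor with geometrically decaying correlations.
    k = np.arange(d)
    c = rho ** np.minimum(k, d - k)
    c = c / np.sqrt(np.sum(c ** 2))  # normalise so that trace(Sigma) = d
    C = circulant(c)

    # i.i.d. N(0, 1/n) entries; each row of W @ C.T has covariance Sigma / n.
    W = rng.standard_normal((n, d)) / np.sqrt(n)
    A = (lam / n) * np.outer(u, v) + W @ C.T
    return A, u, v


def overlap(a, b):
    """Absolute normalised inner product, insensitive to the sign ambiguity."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))


if __name__ == "__main__":
    A, u, v = simulate(n=400, d=100, lam=10.0)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    print("top two singular values:", s[:2])
    print("overlap with u:", overlap(U[:, 0], u))
    print("overlap with v:", overlap(Vt[0], v))
```

Sweeping `lam` downward in this sketch gives a rough feel for how the naive singular vectors lose alignment with the signal at low SNR, which is the regime where the captions above report the largest gains.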

Theorems & Definitions (64)

  • Proposition 4.1
  • Theorem 4.2
  • Corollary 4.3
  • Theorem 4.4: Free energy
  • Remark 4.1: Equivalent models
  • Remark 4.2: Gaussian priors
  • Theorem 5.1
  • Remark 5.1: Assumptions
  • Remark 5.2: Signal priors
  • Corollary 5.2
  • ...and 54 more