Matrix Inference in Growing Rank Regimes

Farzad Pourkamali, Jean Barbier, Nicolas Macris

TL;DR

This work analyzes the inference of a large symmetric signal matrix $\mathbf{S}$ observed through additive Gaussian noise when the rank $M$ grows with the dimension $N$, either sub-linearly, $M=\Theta(N^{\alpha})$ with $0<\alpha<1$, or linearly, $M=\Theta(N)$. In the sub-linear regime it derives the information-theoretic limits (mutual information and MMSE) for two priors, factorized $\mathbf{S}=\mathbf{X}\mathbf{X}^\intercal/N$ and rotationally invariant, and shows that both quantities reduce to the rank-one formulas. In the linear regime it rigorously computes the mutual information and MMSE for rotationally invariant priors via free probability and the Harish-Chandra–Itzykson–Zuber framework, finding an MMSE that is continuous in the SNR with only smoothed-out traces of the phase transitions. The paper explores two efficient algorithms that are conjectured to achieve the MMSE when no statistical-to-computational gap is present: Decimation-AMP for factorized sub-linear models, and a Sub-linear Rotation Invariant Estimator (RIE) with a spectral thresholding flavor for rotationally invariant priors. It also links the sub-linear and linear regimes, showing that the qualitative change in inference behavior occurs only at the boundary $\alpha=1$, and connects these results to free probability and spherical integral asymptotics. Overall, the work provides explicit MMSE and mutual information formulas, Bayes-optimal estimators, and practical algorithms for matrix inference with growing rank, with implications for matrix denoising and factorization tasks.

Abstract

The inference of a large symmetric signal matrix $\mathbf{S} \in \mathbb{R}^{N\times N}$ corrupted by additive Gaussian noise is considered for two regimes of growth of the rank $M$ as a function of $N$. For sub-linear ranks $M=\Theta(N^{\alpha})$ with $\alpha\in(0,1)$, the mutual information and minimum mean-square error (MMSE) are derived for two classes of signal matrices: (a) $\mathbf{S}=\mathbf{X}\mathbf{X}^\intercal$ with the entries of $\mathbf{X}\in\mathbb{R}^{N\times M}$ independent and identically distributed; (b) $\mathbf{S}$ sampled from a rotationally invariant distribution. Surprisingly, the formulas match the rank-one case. Two efficient algorithms are explored and conjectured to saturate the MMSE when no statistical-to-computational gap is present: (1) Decimation Approximate Message Passing; (2) a spectral algorithm based on a Rotation Invariant Estimator. For linear ranks $M=\Theta(N)$, the mutual information is rigorously derived for signal matrices from a rotationally invariant distribution. Close connections with scalar inference in free probability are uncovered, which allow one to deduce a simple formula for the MMSE as an integral involving only the limiting spectral measure of the data matrix. An interesting question is whether the known information-theoretic phase transitions for rank one, and hence also for sub-linear rank, still persist at linear rank. Our analysis suggests that only a smoothed-out trace of the transitions persists. Furthermore, the change of behavior between the low-rank and truly high-rank regimes happens only at the linear scale $\alpha=1$.
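The sub-linear-rank setup described above can be illustrated with a minimal NumPy sketch. It is not the paper's Decimation AMP or Sub-linear RIE: the estimator is a naive known-rank spectral truncation, used only as a baseline. The normalizations are assumptions (the abstract does not fix them): $\mathbf{S}=\mathbf{X}\mathbf{X}^\intercal/N$ as in the TL;DR, observation $\mathbf{Y}=\sqrt{\gamma}\,\mathbf{S}+\mathbf{Z}$, and symmetric Gaussian noise $\mathbf{Z}$ with off-diagonal variance $1/N$. All function names and the per-rank error normalization are hypothetical.

```python
import numpy as np


def sample_sublinear_model(n, alpha, snr, rng):
    """Draw (S, Y, M) for a factorized signal of rank M = floor(N^alpha).

    Assumed normalization (not fixed by the abstract): S = X X^T / N with
    i.i.d. standard Gaussian X, and Y = sqrt(snr) * S + Z, where Z is a
    symmetric Gaussian matrix whose off-diagonal entries have variance 1/N.
    """
    m = max(1, int(np.floor(n ** alpha)))
    x = rng.standard_normal((n, m))
    s = x @ x.T / n
    g = rng.standard_normal((n, n))
    z = (g + g.T) / np.sqrt(2.0 * n)  # symmetrize; off-diagonal variance 1/n
    y = np.sqrt(snr) * s + z
    return s, y, m


def truncated_spectral_estimate(y, m, snr):
    """Naive baseline: keep the top-m eigenpairs of Y, rescale by 1/sqrt(snr)."""
    vals, vecs = np.linalg.eigh(y)  # eigenvalues come back in ascending order
    top_vals, top_vecs = vals[-m:], vecs[:, -m:]
    return (top_vecs * top_vals) @ top_vecs.T / np.sqrt(snr)


def mse_per_spike(s, s_hat, m):
    """Squared Frobenius error divided by the rank (illustrative normalization)."""
    return float(np.sum((s - s_hat) ** 2) / m)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s, y, m = sample_sublinear_model(n=400, alpha=0.5, snr=16.0, rng=rng)
    s_hat = truncated_spectral_estimate(y, m, snr=16.0)
    print("rank M:", m)
    print("truncation baseline MSE:", mse_per_spike(s, s_hat, m))
    print("zero-estimator MSE:", mse_per_spike(s, np.zeros_like(s), m))
```

Under these assumed normalizations the noise spectrum concentrates near $[-2,2]$, so at a large SNR such as $\gamma=16$ the signal eigenvalues separate from the bulk and the truncation baseline should clearly beat the trivial zero estimator. It applies no eigenvalue shrinkage, so it should not be read as Bayes-optimal.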

Paper Structure (49 sections, 17 theorems, 227 equations, 19 figures, 1 table)


Key Result

Theorem 1

Under assumption weak-conv-emp,

Figures (19)

  • Figure 1: Comparison of the MSE reached by our proposed Sub-linear RIE (red) and Decimation AMP (blue) algorithms for sub-linear matrix inference as a function of the size $N$ for various SNR $\gamma$, compared to the rank-one MMSE (Statement \ref{statementSubLin}). Each point represents the average over 10 experiments, with error bars indicating 1 standard deviation. (left) Signal with Gaussian spikes, $M = \lfloor N^{0.5} \rfloor$. (right) Signal with Rademacher spikes, $M = \lfloor N^{0.3} \rfloor$.
  • Figure 2: MMSE for the Rademacher spectral distribution. From left to right: Plot (a): ${\rm MMSE}(\gamma)$ computed from \ref{asymp-MMSE-th} and ${\rm MSE}_{\rm N, RIE}(\gamma)$ points computed from \ref{RIE-est} for $N=1000$, averaged over 20 runs (error bars are invisible). Plots (b) and (c): first and second derivatives of ${\rm MMSE}(\gamma)$ computed using their integral representation (integral computed numerically). Plots (d) and (e): first and second numerical derivatives of (c). These suggest that ${\rm MMSE}^{\prime\prime}(\gamma)$ has a vertical tangent at $\gamma_c = 1$, and that a phase transition, if present, would be of fourth order. A numerical analysis in appendix \ref{Example} is compatible with a weak singularity at $\gamma_c=1$ of the form $(\gamma - 1)^3\ln |\gamma - 1|$.
  • Figure 3: MMSE in the linear-rank regime with sparse spectral priors. The MMSE of the rank-one problem is also plotted for comparison. (left) Signal with Marchenko-Pastur spectral distribution for large values of $q$. The vertical dashed lines correspond to the critical value where the support of $\rho_Y$ splits. (right) Signal with rank $M = (1-p)N$ and Bernoulli spectral distribution, $\rho_S = p \delta_{0} + (1-p) \delta_{+1}$, for values of $p$ close to 1.
  • Figure 4: Comparison of the sub-linear RIE and the oracle estimator \ref{Oracle-est} for Gaussian noise with sub-linear Wishart signal with $M = \lfloor \sqrt{N} \rfloor$. Horizontal lines are MMSE computed from \ref{MSE-Wishart-SubL-MSE}. Points are averaged over 10 experiments (error bars might be invisible).
  • Figure 5: Comparison of the sub-linear RIE and the oracle estimator \ref{Oracle-est} for Gaussian noise with sub-linear signal with $\rho_S = \mathcal{U}([1,2])$. Horizontal lines are MSE computed from \ref{MSE-Uniform-GNoise-Subl}. Points are averaged over 10 experiments (error bars might be invisible).
  • ...and 14 more figures

Theorems & Definitions (38)

  • Theorem 1: Mutual Information for linear rank matrix denoising
  • proof
  • Theorem 2: MMSE for linear rank matrix denoising
  • proof
  • Theorem 3: Explicit Mutual Information for linear rank matrix denoising
  • proof
  • proof
  • Remark 1
  • Remark 2
  • Proposition 1
  • ...and 28 more