A multiscale cavity method for sublinear-rank symmetric matrix factorization

Jean Barbier; Justin Ko; Anas A. Rahman

TL;DR

This work proves that in the spiked Wigner model with rank growing sublinearly as $M={\rm o}(\sqrt{\ln N})$, the limiting mutual information coincides with the rank-one replica symmetric (RS) variational formula, effectively reducing a growing-rank problem to a scalar optimization. The authors introduce a multiscale cavity method to handle two growing dimensions (the size $N$ and the rank $M$) and establish a rank-one reduction for all signal-to-noise ratios (SNR) via information-theoretic and convexity arguments, enabled by an overlap-concentration perturbation and Nishimori identities. They provide a lower bound based on Guerra interpolation and a matching upper bound based on the multiscale cavity method, yielding the limit $\lim_{N\to\infty}F_N(\lambda)=\sup_{q\in[0,\rho]}F_1^{RS}(q,\lambda)$, and derive the associated formula for the minimum mean-square error (MMSE). The approach suggests a pathway for extending replica-symmetric analyses to inference problems whose degrees of freedom are large arrays, with potential reach beyond symmetric matrix factorization to tensors and asymmetric variants.
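To make the "scalar optimization" concrete, the sketch below maximizes a rank-one RS potential over $q\in[0,\rho]$ by grid search. It is an illustration under stated assumptions, not the paper's definition: it assumes a Rademacher prior ($X=\pm1$, so $\rho=1$) and the rank-one potential in the form commonly used in the spiked Wigner literature, $F(q,\lambda)=\mathbb{E}\ln\cosh(\lambda q+\sqrt{\lambda q}\,Z)-\lambda q/2-\lambda q^2/4$ with $Z\sim\mathcal{N}(0,1)$, which may differ from the paper's $F_1^{RS}$ by normalization. The function names and the quadrature and grid sizes are hypothetical choices.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def rs_potential_rademacher(q, lam, nodes, weights):
    """Assumed rank-one RS potential for a Rademacher prior (rho = 1).

    F(q, lam) = E ln cosh(lam*q + sqrt(lam*q) Z) - lam*q/2 - lam*q^2/4,
    with the Gaussian expectation computed by Gauss-Hermite quadrature.
    """
    gamma = lam * q
    arg = gamma + np.sqrt(gamma) * nodes
    # Numerically stable ln cosh(x) = logaddexp(x, -x) - ln 2.
    log_cosh = np.logaddexp(arg, -arg) - np.log(2.0)
    return float(weights @ log_cosh) - gamma / 2.0 - lam * q**2 / 4.0


def maximize_over_q(lam, rho=1.0, grid_size=1001, quad_order=80):
    """Grid search for the maximizer of the potential over q in [0, rho]."""
    nodes, weights = hermegauss(quad_order)
    weights = weights / weights.sum()  # normalize to a standard Gaussian measure
    qs = np.linspace(0.0, rho, grid_size)
    values = np.array(
        [rs_potential_rademacher(q, lam, nodes, weights) for q in qs]
    )
    best = int(np.argmax(values))
    return qs[best], values[best]


if __name__ == "__main__":
    for lam in (0.5, 2.0):
        q_star, f_star = maximize_over_q(lam)
        print(f"lambda={lam}: q*={q_star:.3f}, sup F={f_star:.5f}")
```

Under these assumptions the maximizer sits at $q=0$ for small $\lambda$ and moves to a strictly positive value once $\lambda$ is large enough. The result summarized above says that, in the regime $M={\rm o}(\sqrt{\ln N})$, a one-dimensional search of this kind is all that the limiting formula requires.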

Abstract

We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime where the rank $M$ of the signal matrix to infer scales with its size $N$ as $M={\rm o}(\sqrt{\ln N})$. Allowing for an $N$-dependent rank offers new challenges and requires new methods. Working in the Bayes-optimal setting, we show that whenever the signal has i.i.d. entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is the same as when $M=1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method allowing for growing rank along with some information-theoretic identities on worst noise for the vector Gaussian channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.
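As a rough illustration of the observation model described in the abstract, the sketch below samples a rank-$M$ signal with i.i.d. entries and a noisy symmetric observation. The scaling $Y=\sqrt{\lambda/N}\,XX^\top+Z$ is the usual spiked Wigner convention and is an assumption here, since the abstract does not state the paper's normalization. The Rademacher prior, the treatment of the noise diagonal, and the function name are likewise hypothetical choices.

```python
import numpy as np


def sample_spiked_wigner(n, m, lam, rng):
    """Sample (X, Y) with Y = sqrt(lam / n) * X X^T + Z (assumed convention).

    X is n-by-m with i.i.d. Rademacher entries (an illustrative prior).
    Z is a symmetric Gaussian matrix with unit-variance off-diagonal entries;
    with this construction the diagonal entries have variance 2.
    """
    x = rng.choice([-1.0, 1.0], size=(n, m))
    g = rng.standard_normal((n, n))
    z = (g + g.T) / np.sqrt(2.0)
    y = np.sqrt(lam / n) * (x @ x.T) + z
    return x, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = sample_spiked_wigner(n=200, m=3, lam=2.0, rng=rng)
    print(x.shape, y.shape, np.allclose(y, y.T))
```

The inference task is to recover information about `x` (more precisely, about `x @ x.T`) from `y`. In the regime studied here the rank `m` is allowed to grow with `n`, but only as $M={\rm o}(\sqrt{\ln N})$.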

Paper Structure (19 sections, 27 theorems, 280 equations)

Introduction
Setting and main results
The sublinear-rank spiked Wigner model and the replica symmetric potential
Equivalence of the suprema of the rank-$M$ and rank-one replica symmetric potentials
Bounding the limiting free entropy via the interpolation and multiscale cavity methods
Relating the rank-$M$ variational formula to its rank-one analog
Information-theoretic inequalities on worst Gaussian noise
Properties of maximizers of the replica symmetric potential
Rank-one reduction for high, low, and all SNR
Standard prerequisites: Interpolation and overlap concentration
Free entropy lower bound: Guerra interpolation
Thermal concentration of the overlap matrix
Negligibility of the side information
Treatment of growing rank: The multiscale cavity method
Free entropy upper bound: Multiscale Aizenman--Sims--Starr identity
...and 4 more sections

Key Result

Theorem 1

Assume the following hypotheses: [...]. Setting $\rho:=\mathbb{E}_{\mathbb{P}_X}X^2$, the limiting free entropy of the spiked Wigner model is then given in terms of the replica symmetric potential by $\lim_{N\to\infty}F_N(\lambda)=\sup_{q\in[0,\rho]}F_1^{RS}(q,\lambda)$. As a consequence, we have the following formula for the limiting minimum mean-square error of the spiked Wigner model: [...] where $q^*(\lambda):=\ma

Theorems & Definitions (55)

Theorem 1: Rank-one replica formula for the growing-rank spiked Wigner model
Remark 1: Properties of the replica symmetric potential
Lemma 1: Off-diagonal entries of the noise covariance bolster information
Corollary 1: Worst Gaussian noise with covariance of fixed trace
Proposition 1: Properties of $\bm{Q}^*(\lambda)$
Theorem 2: Rank-one reduction
Proposition 2: Free entropy lower bound
Proposition 3: Free entropy upper bound
Theorem 3: Multiscale Aizenman--Sims--Starr identity
Remark 2
...and 45 more
