Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise

Jean Barbier, Francesco Camilli, Marco Mondelli, Yizhou Xu

TL;DR

This work tackles the problem of inferring a rank-1 spike from a high-dimensional symmetric matrix corrupted by structured noise drawn from a rotationally invariant trace ensemble. By applying the replica method, the authors derive a variational formula for the replica-symmetric free entropy and show, via the Nishimori identities, that the resulting fixed-point equations collapse to a single scalar parameter. This yields the Bayes-optimal MMSE and links the information-theoretic limits to those of a Gaussian surrogate model with an effective SNR. They further develop an AdaTAP framework that recasts the non-quadratic likelihood into a quadratic surrogate with an Onsager correction, enabling an efficient TAP iteration scheme whose updates hinge on a pre-processing function J(·) derived from the noise potential V. The main contributions are (i) an explicit information-theoretic characterization for spiked matrix models with general trace-ensemble noise, (ii) a replica-based reduction to a Gaussian surrogate with the same fixed-point MMSE, and (iii) a practical TAP-based algorithm that matches the predicted Bayes-optimal performance in simulations, including on noise matrices derived from real data, supporting a form of universality across rotationally invariant noise classes. The work provides both theoretical insight and a scalable algorithm for structured-noise inference, with potential impact on high-dimensional covariance estimation and related problems where noise structure matters.

Abstract

We consider a prototypical problem of Bayesian inference for a structured spiked model: a low-rank signal is corrupted by additive noise. While both information-theoretic and algorithmic limits are well understood when the noise is a Gaussian Wigner matrix, the more realistic case of structured noise still proves to be challenging. To capture the structure while maintaining mathematical tractability, a line of work has focused on rotationally invariant noise. However, existing studies either provide sub-optimal algorithms or are limited to special cases of noise ensembles. In this paper, using tools from statistical physics (replica method) and random matrix theory (generalized spherical integrals) we establish the first characterization of the information-theoretic limits for a noise matrix drawn from a general trace ensemble. Remarkably, our analysis unveils the asymptotic equivalence between the rotationally invariant model and a surrogate Gaussian one. Finally, we show how to saturate the predicted statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer (TAP) equations.
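
To make the setting in the abstract concrete, the following is a minimal sketch of how one might sample a rank-1 spiked matrix with rotationally invariant noise. It is an illustration under stated assumptions, not the authors' code: the normalization Y = (snr / n) x xᵀ + O D Oᵀ, the Rademacher signal, the uniform noise spectrum, and the names `haar_orthogonal` and `spiked_matrix` are all hypothetical choices made here.

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix from the Haar measure via QR."""
    g = rng.standard_normal((n, n))
    q, r = np.linalg.qr(g)
    # Flip column signs so that the diagonal of R is positive; without this
    # correction the QR output is not exactly Haar distributed.
    return q * np.sign(np.diag(r))

def spiked_matrix(n, snr, eigenvalues, rng):
    """Return (Y, x) with Y = (snr / n) x x^T + O diag(eigenvalues) O^T.

    The scaling and the Rademacher prior are assumptions for illustration.
    """
    x = rng.choice([-1.0, 1.0], size=n)   # Rademacher signal (assumed prior)
    o = haar_orthogonal(n, rng)           # Haar-distributed eigenvectors
    z = (o * eigenvalues) @ o.T           # noise Z = O D O^T, rotationally invariant
    y = (snr / n) * np.outer(x, x) + z    # rank-1 spike plus structured noise
    return y, x

rng = np.random.default_rng(0)
n = 200
eigs = rng.uniform(-1.0, 1.0, size=n)     # hypothetical noise spectrum on [-1, 1]
Y, x = spiked_matrix(n, snr=3.0, eigenvalues=eigs, rng=rng)
```

Because the eigenvectors of the noise are Haar distributed and independent of its eigenvalues, the law of Z is unchanged under conjugation by any fixed orthogonal matrix, which is the rotational invariance the abstract refers to. A trace ensemble would correspond to drawing the spectrum from the density induced by a potential V rather than from the uniform distribution used here.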

Paper Structure (11 sections, 91 equations, 4 figures, 1 algorithm)

Figures (4)

  • Figure 1: The performance of the TAP iterations (dots) closely matches the replica prediction for the minimum mean-square error (solid lines), for various distributions of noise eigenvalues (in different plots) and for two signal priors (Gaussian in red and Rademacher in blue). Error bars correspond to the standard deviation over 10 trials. Top left: quartic potential. Top right: sextic potential. Bottom left: Marchenko–Pastur distribution of eigenvalues. Bottom right: truncated normal distribution of eigenvalues. The green dashed lines (which perfectly overlap the red solid lines) denote the theoretical performance of spectral PCA as predicted by Benaych-Georges and Nadakuditi (2011). We do not include the performance of spectral PCA for the truncated normal distribution of eigenvalues due to numerical instabilities.
  • Figure 2: The performance of the TAP iterations (dots) closely matches the replica prediction for the minimum mean-square error (solid lines), for two sparse priors (in different plots) and two distributions of noise eigenvalues (quartic potential in red and Marchenko–Pastur distribution in blue). Error bars correspond to the standard deviation over 10 trials. Left: two-point prior. Right: sparse Rademacher prior.
  • Figure 3: Comparison between the TAP algorithm and the optimal degree-D lifted OAMP algorithm of Dudeja et al. (2024) on noise matrices derived from the Hapmap3 dataset, with Rademacher signals. Solid lines correspond to the state evolution of the optimal degree-D lifted OAMP algorithm and the dashed black line is the replica prediction for the MMSE.
  • Figure 4: Comparison between the TAP iterations for the quartic noise model (with $\lambda=2$) and its information-theoretic equivalent Gaussian surrogate model. The error bars represent the standard deviation computed over 10 trials. The dashed black line represents the MMSE predicted by the replica theory.