Matrix denoising: Bayes-optimal estimators via low-degree polynomials

Guilhem Semerjian

Abstract

We consider the additive version of the matrix denoising problem, where a random symmetric matrix $S$ of size $n$ has to be inferred from the observation of $Y=S+Z$, with $Z$ an independent random matrix modeling the noise. For prior distributions of $S$ and $Z$ that are invariant under conjugation by orthogonal matrices we determine, using results from first and second order free probability theory, the Bayes-optimal (in terms of the mean square error) polynomial estimators of degree at most $D$, asymptotically in $n$, and show that as $D$ increases they converge towards the estimator introduced by Bun, Allez, Bouchaud and Potters in [IEEE Transactions on Information Theory 62, 7475 (2016)]. We conjecture that this optimality holds beyond strictly orthogonally invariant priors, and provide partial evidence of this universality phenomenon when $S$ is an arbitrary Wishart matrix and $Z$ is drawn from the Gaussian Orthogonal Ensemble, a case motivated by the related extensive rank matrix factorization problem.
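
As a rough numerical illustration of the setting $Y=S+Z$, the sketch below draws a Wishart signal and GOE noise, then fits the coefficients of a matrix polynomial $\sum_{k=0}^{D} c_k Y^k$ by least squares against the true $S$. This is an oracle fit on a single finite-size sample, not the asymptotic free-probability computation of the paper. The normalisations of $S$ and $Z$, the reading of `alpha` as an aspect ratio and of `delta` as a noise variance, the restriction to estimators of the form $\sum_k c_k Y^k$, and the function names are all assumptions made for this example.

```python
import numpy as np


def sample_instance(n, alpha, delta, rng):
    """Draw S (Wishart) and Y = S + Z (GOE noise); normalisations are assumed."""
    m = int(round(alpha * n))            # assumed aspect ratio alpha = m / n
    X = rng.standard_normal((n, m))
    S = X @ X.T / m                      # Wishart signal with mean equal to the identity
    G = rng.standard_normal((n, n))
    # Symmetric Gaussian noise with off-diagonal variance delta / n (assumed scaling).
    Z = np.sqrt(delta) * (G + G.T) / np.sqrt(2 * n)
    return S, S + Z


def fit_polynomial_denoiser(S, Y, D):
    """Oracle least-squares fit of c_0..c_D in sum_k c_k Y^k against the true S."""
    n = Y.shape[0]
    powers = [np.eye(n)]
    for _ in range(D):
        powers.append(powers[-1] @ Y)    # Y^0, Y^1, ..., Y^D
    # Normal equations: sum_l c_l tr(Y^{k+l}) / n = tr(S Y^k) / n.
    gram = np.array([[np.sum(P * Q) for Q in powers] for P in powers]) / n
    rhs = np.array([np.sum(S * P) for P in powers]) / n
    coeffs = np.linalg.lstsq(gram, rhs, rcond=None)[0]
    estimate = sum(c * P for c, P in zip(coeffs, powers))
    mse = np.sum((S - estimate) ** 2) / n
    return coeffs, mse


rng = np.random.default_rng(0)
S, Y = sample_instance(n=200, alpha=1.0, delta=0.2, rng=rng)
raw_mse = np.sum((S - Y) ** 2) / 200     # error of the trivial estimator Y itself
mses = [fit_polynomial_denoiser(S, Y, D)[1] for D in range(1, 5)]
print(raw_mse, mses)
```

Because the polynomial families are nested, the in-sample error of this oracle fit cannot increase with $D$, and already at $D=1$ it is no larger than the error of using $Y$ directly. In this sketch the coefficients depend on the unknown $S$, so it only illustrates what a degree-$D$ polynomial can achieve on one sample; it is not a usable estimator.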

Paper Structure (40 sections, 149 equations, 10 figures)

Figures (10)

  • Figure 1: Illustration of the non-crossing partition of Eq. (\ref{eq_NC}); for clarity the block containing 1 has been drawn above the horizontal axis, the other blocks below. Here $p+1=13$, the block containing the first element has cardinality $m=4$ and reads $\{1,4,5,9\}$, corresponding to $j_1=2$, $j_2=0$, $j_3=3$, the lengths of the successive intervals it does not cover. Because of the non-crossing condition the other blocks of the partition decompose into non-crossing partitions of the intervals not covered, of length $j_1=2$, $j_2=0$, $j_3=3$, $p+1-m-j_1-j_2-j_3=4$.
  • Figure 2: The curves of ${\rm MMSE}^{(D)}$ as a function of $\Delta$, for $\alpha=1$ (left panel) and $\alpha=5$ (right panel), different colors corresponding to different values of $D$; the black curve labeled BABP is ${\rm MSE}_{\rm BABP}$, the large-$D$ limit of ${\rm MMSE}^{(D)}$. The insets present the ratios ${\rm MMSE}^{(D)}/{\rm MSE}_{\rm BABP}$, with the same color code as in the main plots.
  • Figure 3: The density $\rho_Y$ of the observation matrix (left panel), and the optimal denoising function $\mathcal{D}_{\rm BABP}$ along with its low degree approximations $\mathcal{D}^{(D)}$ (right panel), for $\alpha=1$ and $\Delta=0.2$.
  • Figure 4: The density $\rho_Y$ of the observation matrix (left panel), and the optimal denoising function $\mathcal{D}_{\rm BABP}$ along with its low degree approximations $\mathcal{D}^{(D)}$ (right panel), for $\alpha=5$ and $\Delta=0.2$.
  • Figure 5: The graphs in $\mathcal{A}_{\rm od}^{(2)}$ that contribute to the off-diagonal estimators at order 2, in the presence of the inversion symmetry. White (resp. black) circles represent the marked (resp. unmarked) vertices. The corresponding formulas for these four estimators can be found in equation (\ref{eq_babcd}).
  • ...and 5 more figures
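
The decomposition described in the caption of Figure 1 can be turned into a counting recursion: fix the size $m$ of the block containing the first element, then fill each of the $m$ uncovered intervals independently with a non-crossing partition. The sketch below implements that recursion and checks that it gives the Catalan numbers, the standard count of non-crossing partitions. The function names are hypothetical, and the code only counts partitions; it does not reproduce the weighted sum of the paper's equation.

```python
from functools import lru_cache
from math import comb, prod


@lru_cache(maxsize=None)
def gap_products(total, parts):
    """Sum over j_1, ..., j_parts >= 0 with j_1 + ... + j_parts = total
    of the product of count_nc(j_i), one factor per uncovered interval."""
    if parts == 0:
        return 1 if total == 0 else 0
    return sum(count_nc(j) * gap_products(total - j, parts - 1)
               for j in range(total + 1))


@lru_cache(maxsize=None)
def count_nc(size):
    """Number of non-crossing partitions of {1, ..., size}."""
    if size == 0:
        return 1
    # m is the cardinality of the block containing the first element;
    # the remaining size - m elements are split among the m intervals it leaves.
    return sum(gap_products(size - m, m) for m in range(1, size + 1))


# Figure 1 example: 13 elements, first block {1, 4, 5, 9}, interval lengths 2, 0, 3, 4.
example_gaps = (2, 0, 3, 4)
completions = prod(count_nc(j) for j in example_gaps)   # 2 * 1 * 5 * 14 = 140
print(completions, count_nc(13))
```

With the first block fixed as in Figure 1, the other blocks can be chosen in 140 ways, while the total over all first blocks is the Catalan number $C_{13}$.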