Optimal tuning-free convex relaxation for noisy matrix completion

Yuepeng Yang, Cong Ma

TL;DR

This paper studies noisy matrix completion under uniform sampling and $\mu$-incoherence and introduces a tuning-free square-root matrix completion estimator (square-root MC), whose regularization parameter scales as $\lambda \asymp 1/\sqrt{n}$. The authors establish, with high probability, minimax-optimal Frobenius and entrywise error bounds that depend on the condition number $\kappa$, incoherence $\mu$, rank $r$, sampling probability $p$, and noise level $\sigma$. A key methodological contribution links the convex square-root MC to a smooth nonconvex reformulation through a new variable $\theta$, and proves that an approximate stationary point of the nonconvex problem lies close to both the ground truth and the convex solution; leave-one-out techniques control the iterates. Overall, the tuning-free estimator achieves optimal statistical performance without knowledge of the noise size, which is a practical advantage, and the analysis offers a blueprint for connecting convex and nonconvex approaches in high-dimensional recovery.

Abstract

This paper is concerned with noisy matrix completion--the problem of recovering a low-rank matrix from partial and noisy entries. Under uniform sampling and incoherence assumptions, we prove that a tuning-free square-root matrix completion estimator (square-root MC) achieves optimal statistical performance for solving the noisy matrix completion problem. Similar to the square-root Lasso estimator in high-dimensional linear regression, square-root MC does not rely on the knowledge of the size of the noise. While solving square-root MC is a convex program, our statistical analysis of square-root MC hinges on its intimate connections to a nonconvex rank-constrained estimator.
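
For concreteness, the estimator described above can be written out as follows. This is a sketch in assumed notation, chosen by analogy with the square-root Lasso: $\Omega$ is the set of observed indices, $\mathcal{P}_{\Omega}$ sets the unobserved entries to zero, $\bm{M}$ holds the noisy observations, $\|\cdot\|_{*}$ is the nuclear norm, and the matrix is taken to be $n \times n$. None of these symbols is fixed by the text above.

$$
\bm{L}_{\mathrm{cvx}} \in \arg\min_{\bm{L} \in \mathbb{R}^{n \times n}} \; \big\|\mathcal{P}_{\Omega}(\bm{L} - \bm{M})\big\|_{\mathrm{F}} + \lambda \|\bm{L}\|_{*}, \qquad \lambda \asymp 1/\sqrt{n}.
$$

The usual squared-loss relaxation replaces the first term with $\tfrac{1}{2}\|\mathcal{P}_{\Omega}(\bm{L} - \bm{M})\|_{\mathrm{F}}^{2}$ and needs a regularization parameter that scales with the noise level $\sigma$. The unsquared loss is positively homogeneous: multiplying $\bm{M}$ by $c > 0$ multiplies the whole objective at $c\bm{L}$ by $c$, so the minimizer rescales with the data and the same $\lambda$ serves every noise level.

One standard identity that trades the square root for an auxiliary scalar, offered here only as an illustration since the paper's own reformulation may take a different form, is

$$
\sqrt{a} = \min_{\theta > 0} \left( \frac{a}{2\theta} + \frac{\theta}{2} \right) \quad \text{for } a > 0, \qquad \text{attained at } \theta = \sqrt{a}.
$$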

Paper Structure (41 sections, 10 theorems, 117 equations, 5 figures, 2 algorithms)


Key Result

Theorem 1

Suppose that Assumptions \ref{assumption:p}-\ref{assumption:incoherence} hold. In addition, assume that the sample size and the noise level satisfy for some sufficiently large (resp. small) constant $C_{\mathrm{sample}}>0$ (resp. $C_{\mathrm{noise}}>0$). Set $\lambda=C_{\lambda}/\sqrt{n}$ for the square-root MC estimator \eqref{eq:sqrt-mc}, where $C_{\lambda}$ is some large absolute constant (e.g., 32). With probabil

Figures (5)

  • Figure 1: (a) Relative estimation error of $\bm{L}_{\mathrm{cvx}}$ vs. noise size $\sigma$ on a log-log scale, where we fix $n=500,r=5,p=0.5$; (b) Relative estimation error of $\bm{L}_{\mathrm{cvx}}$ vs. problem size $\sqrt{n}$, where we fix $r=5,\sigma=10^{-4},p=0.5$; (c) Relative estimation error of $\bm{L}_{\mathrm{cvx}}$ vs. observation probability $p$ on a log-log scale, where we fix $n=2000, r=5,\sigma=10^{-4}$. For all three plots, $\lambda=4/\sqrt{n}$ and each point represents the average of 20 independent trials.
  • Figure 2: Relative Frobenius estimation error of convex and nonconvex solutions and their distance. The parameters are chosen as: $n=200,r=5,p=0.5$ while $\sigma$ varies from $10^{-5}$ to $10^{-3}$.
  • Figure 3: (a) Relative Frobenius estimation error of square-root MC and solution of \eqref{eq:mc} with oracle and cross-validated $\lambda$ vs. problem size $\sqrt{n}$. The parameters are fixed as $\sigma=10^{-4}, r=5, p=0.5$. (b) Relative Frobenius estimation error of square-root MC and solution of \eqref{eq:mc} with oracle and cross-validated $\lambda$ vs. noise size $\sigma$ on a log-log scale. The parameters are fixed as $n=400, r=5, p=0.5$. In both settings, $k = 10$ for the number of folds in cross validation and each point represents the average of 10 independent trials.
  • Figure 4: Relative Frobenius estimation error of square-root MC and solution of \eqref{eq:mc} with oracle and cross-validated $\lambda$ vs. $\gamma$ for approximately low rank matrices. The parameters are chosen as: $n = 400,r=5,p=0.5, \sigma = 10^{-4}, \lambda = 2/\sqrt{n}$ while $\gamma$ varies from $0$ to $0.2$. Each point represents the average of 10 independent trials.
  • Figure 5: Relative Frobenius estimation error of square-root MC and \eqref{eq:nonconvex-opt-theta} for approximately low rank matrices. The parameters are chosen as: $n = 400,r=5,p=0.5, \sigma=10^{-4}, \lambda = 2/\sqrt{n}$ while $\gamma$ varies from $0$ to $0.2$. Each point represents the average of 10 independent trials.
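
The synthetic setting named in the figure captions (size $n$, rank $r$, observation probability $p$, noise size $\sigma$, and $\lambda$ proportional to $1/\sqrt{n}$) can be sketched in NumPy as below. This is a minimal illustration, not the authors' experimental code. The Gaussian factor model for the ground truth, the i.i.d. Gaussian noise, and the helper names `make_instance`, `sqrt_mc_objective`, `squared_mc_objective`, and `relative_error` are all assumptions. The sketch only builds an instance and evaluates objectives; it does not solve the convex program.

```python
import numpy as np


def make_instance(n=100, r=5, p=0.5, sigma=1e-4, seed=0):
    """Draw a rank-r n x n matrix and a noisy, partially observed copy.

    Assumed model (not taken from the paper): Gaussian factors,
    Bernoulli(p) sampling, i.i.d. N(0, sigma^2) noise on observed entries.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, r))
    Y = rng.standard_normal((n, r))
    L_star = X @ Y.T                      # rank-r ground truth
    mask = rng.random((n, n)) < p         # True where an entry is observed
    noise = sigma * rng.standard_normal((n, n))
    M_obs = np.where(mask, L_star + noise, 0.0)
    return L_star, M_obs, mask


def sqrt_mc_objective(L, M_obs, mask, lam):
    """Unsquared Frobenius loss on observed entries plus nuclear-norm penalty."""
    residual = np.where(mask, L - M_obs, 0.0)
    return np.linalg.norm(residual, "fro") + lam * np.linalg.norm(L, "nuc")


def squared_mc_objective(L, M_obs, mask, lam):
    """Squared-loss counterpart, shown only for comparison."""
    residual = np.where(mask, L - M_obs, 0.0)
    return 0.5 * np.linalg.norm(residual, "fro") ** 2 + lam * np.linalg.norm(L, "nuc")


def relative_error(L_hat, L_star):
    """Relative Frobenius error, the quantity plotted in the figures."""
    return np.linalg.norm(L_hat - L_star, "fro") / np.linalg.norm(L_star, "fro")


# Rescaling the data by c rescales the square-root objective by c,
# so one lambda works across noise scales; the squared loss lacks this property.
n = 100
L_star, M_obs, mask = make_instance(n=n)
lam = 4.0 / np.sqrt(n)
c = 10.0
scaled = sqrt_mc_objective(c * L_star, c * M_obs, mask, lam)
unscaled = sqrt_mc_objective(L_star, M_obs, mask, lam)
print("sqrt objective is homogeneous:", np.isclose(scaled, c * unscaled))
```

Any candidate estimate, whatever solver produced it, can then be scored with `relative_error(L_hat, L_star)`.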

Theorems & Definitions (12)

  • Theorem 1
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • proof
  • Lemma 6
  • Lemma 7
  • Remark 1
  • ...and 2 more