Matrix Completion with Graph Information: A Provable Nonconvex Optimization Approach

Yao Wang, Yiyang Yang, Kaidong Wang, Shanxing Gao, Xiuwu Liao

TL;DR

This work addresses matrix completion with graph side information by proposing GSGD, a provable nonconvex optimization method that incorporates higher-order graph smoothness through preconditioned gradient updates. It introduces a graph-aware regularization via $L_W,L_H$ derived from inverse-graph operators, along with a graph incoherence-based projection and a graph spectral initialization to ensure fast, linear convergence with near-optimal sample complexity. Theoretical results provide contraction and linear convergence guarantees under a graph quality measure $\psi$ and incoherence, complemented by practical efficiency: per-iteration complexity scales with observed entries and graph sparsity, enabling large-scale recovery. Empirical results on synthetic and real-world data demonstrate superior recovery accuracy and scalability compared with state-of-the-art graph-regularized and graph-agnostic methods, and show robustness to false edges. The approach offers a principled, scalable framework for exploiting higher-order graph structure in matrix completion and related graph-regularized recovery problems.

Abstract

We consider the problem of matrix completion with graphs as side information depicting the interrelations between variables. The key challenge lies in leveraging the similarity structure of the graph to enhance matrix recovery. Existing approaches, primarily based on graph Laplacian regularization, suffer from several limitations: (1) they focus only on the similarity between neighboring variables while overlooking long-range correlations; (2) they are highly sensitive to false edges in the graphs; and (3) they lack theoretical guarantees regarding statistical and computational complexities. To address these issues, we propose a novel graph-regularized matrix completion algorithm called GSGD, based on a preconditioned projected gradient descent approach. We demonstrate that GSGD effectively captures the higher-order correlation information behind the graphs and achieves superior robustness and stability against false edges. Theoretically, we prove that GSGD achieves linear convergence to the global optimum with near-optimal sample complexity, providing the first theoretical guarantees for both recovery accuracy and efficacy from the perspective of nonconvex optimization. Our numerical experiments on both synthetic and real-world data further validate that GSGD achieves superior recovery accuracy and scalability compared with several popular alternatives.
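To make the phrase "preconditioned gradient descent" concrete, the following is a minimal sketch of a graph-agnostic, ScaledGD-style update for low-rank matrix completion with a standard spectral initialization. It is not the authors' GSGD: the graph operators $L_W, L_H$, the graph incoherence projection, and the graph spectral initialization are all omitted, and the function names, step size `eta = 0.5`, and problem sizes are illustrative assumptions.

```python
import numpy as np

def spectral_init(M_obs, p, r):
    """Standard spectral initialization: top-r SVD of the rescaled observations."""
    U, s, Vt = np.linalg.svd(M_obs / p, full_matrices=False)
    root = np.sqrt(s[:r])
    return U[:, :r] * root, Vt[:r, :].T * root

def scaled_gd_complete(M_obs, mask, r, eta=0.5, iters=300):
    """ScaledGD-style completion (assumed variant, no projection step).

    Each factor's gradient is right-multiplied by the inverse Gram matrix
    of the other factor, which acts as the preconditioner.
    """
    p = mask.mean()  # empirical sampling rate
    L, R = spectral_init(M_obs, p, r)
    for _ in range(iters):
        # Gradient of the rescaled squared loss on observed entries only.
        G = mask * (L @ R.T - M_obs) / p
        # solve(A, B.T).T equals B @ inv(A) for symmetric A.
        L_new = L - eta * np.linalg.solve(R.T @ R, (G @ R).T).T
        R_new = R - eta * np.linalg.solve(L.T @ L, (G.T @ L).T).T
        L, R = L_new, R_new
    return L @ R.T

# Toy problem (hypothetical sizes): rank-2 matrix, about half the entries observed.
rng = np.random.default_rng(0)
m, n, r = 60, 50, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
mask = rng.random((m, n)) < 0.5
M_hat = scaled_gd_complete(mask * M, mask, r)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative recovery error: {rel_err:.2e}")
```

The preconditioners here are the $r \times r$ Gram matrices of the factors, so each iteration costs little more than plain gradient descent. According to the summary above, GSGD additionally builds graph information into the preconditioning and projection, which this sketch does not attempt to reproduce.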

Paper Structure

This paper contains 26 sections, 15 theorems, 107 equations, 6 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

The projection of $\widetilde{F}$ in (projection operator) has the following closed-form solution: $\mathcal{P}_B(\widetilde{F}) = [ (L_W^{-\frac{1}{2}}\mathcal{W})^T, (L_H^{-\frac{1}{2}}\mathcal{H})^T]^T,$ where each row of matrices $\mathcal{W}\in \mathbb{R}^{m\times r}$ and $\mathcal{H}\in

Figures (6)

  • Figure 1: Several representative graphs and the corresponding bar plots of $|(I_M - \mathcal{A})_{1:}|$, i.e., the first row of the matrix $|I_M - \mathcal{A}|$, for different values of $\lambda$, where different colors and numbers represent the node identifiers.
  • Figure 2: The recovery RMSE and trajectory with respect to iterations of ScaledGD, RGD and GSGD on the toy matrix factorization problem, where the color of the trajectory points changes gradually with the number of iterations from $1$ to $100$. Left two: $x_1 = x_2 = 1$; right two: $x_1 = 2$, $x_2 = 1$.
  • Figure 3: (a): the magnitudes of $\text{distance} := \text{distance}_\text{standard} - \text{distance}_\text{graph}$ in $500$ synthetic experiments. (b)(c): the recovery RMSE and iteration trajectories of GSGD with standard spectral and graph spectral initialization, where the color of the trajectory points changes gradually with the number of iterations from $1$ to $100$. (d): Curve of the ratio of spectral norms $\frac{\|\mathcal{A}X\mathcal{B} - X \|_\text{op}}{\|X\|_\text{op}}$ with respect to the proportion of false edges in the graphs.
  • Figure 4: (a)(b): The test RMSE of various algorithms with respect to iteration count for cases with noise-free and noisy observations, where the Y-axis is set to a logarithmic scale for clarity. (c)(d): The synthetic experiments in the presence of false edges with proportion = 5% and 20%, respectively.
  • Figure 5: The RMSE and the number of iterations required to achieve RMSE $\leq 0.05$ (noise-free case) and $0.1$ (noisy case) under sampling rate $p = 5\%, 10\%, 15\%, 20\%$. Left two: noise-free observations; Right two: noisy observations.
  • ...and 1 more figure

Theorems & Definitions (34)

  • Definition 1: Standard incoherence [chen2015incoherence]
  • Definition 2: Graph incoherence
  • Definition 3: Graph-aware distance metric
  • Proposition 1
  • Definition 4: Standard spectral initialization
  • Definition 5: Graph spectral initialization
  • Definition 6: Graph quality measure
  • Theorem 1: Property of new projection operator
  • Theorem 2: Linear convergence of the iterates
  • Theorem 3: Graph spectral initialization
  • ...and 24 more