Matrix Completion with Hypergraphs: Sharp Thresholds and Efficient Algorithms
Zhongtian Ma, Qiaosheng Zhang, Zhen Wang
TL;DR
This work addresses exact matrix completion from a sub-sampled rating matrix augmented with observed social graphs and hypergraphs. It introduces MCH, a three-stage algorithm that uses spectral clustering on the hypergraph-augmented graph, majority-rule rating vector estimation, and iterative local refinement to achieve exact recovery. A key result is a sharp threshold for the sample probability $p$, expressed as $p^* = \max\big\{\frac{\text{log terms}}{I_\theta\,\text{...}},\ \frac{K \log m}{I_\theta n}\big\}$, below which exact recovery is information-theoretically impossible and above which MCH succeeds with high probability; this threshold decreases as hypergraph quality (and the combined information $I_d$) improves. The paper further quantifies the gain from hypergraphs and provides an information-theoretic lower bound that matches the algorithmic threshold in the symmetric setting, supported by synthetic and semi-real experiments. Overall, the results establish both the utility of hypergraphs in matrix completion and the near-optimal sample efficiency of MCH for exact recovery in structured social-data settings.
Abstract
This paper considers the problem of completing a rating matrix based on sub-sampled matrix entries as well as observed social graphs and hypergraphs. We show that there exists a \emph{sharp threshold} on the sample probability for the task of exactly completing the rating matrix -- the task is achievable when the sample probability is above the threshold, and is impossible otherwise -- demonstrating a phase transition phenomenon. The threshold can be expressed as a function of the ``quality'' of hypergraphs, enabling us to \emph{quantify} the amount of reduction in sample probability due to the exploitation of hypergraphs. This also highlights the usefulness of hypergraphs in the matrix completion problem. En route to discovering the sharp threshold, we develop a computationally efficient matrix completion algorithm that effectively exploits the observed graphs and hypergraphs. Theoretical analyses show that our algorithm succeeds with high probability as long as the sample probability exceeds the aforementioned threshold, and this theoretical result is further validated by synthetic experiments. Moreover, our experiments on a real social network dataset (with both graphs and hypergraphs) show that our algorithm outperforms other state-of-the-art matrix completion algorithms.
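The three-stage pipeline summarized in the TL;DR (spectral clustering, majority-rule rating estimation, local refinement) can be illustrated with a minimal NumPy sketch. It is not the paper's MCH implementation. It assumes two clusters, noiseless binary ratings, and hyperedges folded into the graph by clique expansion, and it uses a simple agreement-plus-connectivity score in place of a likelihood-based refinement rule. All function names and the toy data are hypothetical.

```python
import itertools
import numpy as np


def build_weights(n, edges, hyperedges):
    """Fold graph edges and hyperedges into one weighted adjacency matrix.

    Assumption: each hyperedge is clique-expanded (every pair inside it gains
    weight 1). This is an illustrative choice, not necessarily the paper's.
    """
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] += 1.0
        W[j, i] += 1.0
    for h in hyperedges:
        for i, j in itertools.combinations(h, 2):
            W[i, j] += 1.0
            W[j, i] += 1.0
    return W


def spectral_two_clusters(W):
    """Stage 1: split users by the sign of the second-largest eigenvector."""
    _, vecs = np.linalg.eigh(W)  # eigenvalues come back in ascending order
    return (vecs[:, -2] > 0).astype(int)


def majority_vectors(R, mask, labels, K=2):
    """Stage 2: per cluster and item, majority vote over observed 0/1 ratings."""
    V = np.zeros((K, R.shape[1]), dtype=int)
    for k in range(K):
        rows = labels == k
        ones = (mask[rows] & (R[rows] == 1)).sum(axis=0)
        zeros = (mask[rows] & (R[rows] == 0)).sum(axis=0)
        V[k] = (ones > zeros).astype(int)  # ties and unobserved items default to 0
    return V


def refine(R, mask, W, labels, V, rounds=3):
    """Stage 3: reassign each user to the best-scoring cluster, then re-vote.

    Assumption: the score is (rating agreements) + (edge weight into the
    cluster), a simple stand-in for a likelihood-based refinement rule.
    """
    K = V.shape[0]
    for _ in range(rounds):
        new = labels.copy()
        for i in range(len(labels)):
            scores = [
                (mask[i] & (R[i] == V[k])).sum() + W[i, labels == k].sum()
                for k in range(K)
            ]
            new[i] = int(np.argmax(scores))
        labels = new
        V = majority_vectors(R, mask, labels, K)
    return labels, V


# Toy instance: users 0-3 and 4-7 form two clusters joined by one cross edge.
n = 8
edges = [(i, j) for block in (range(0, 4), range(4, 8))
         for i, j in itertools.combinations(block, 2)] + [(3, 4)]
hyperedges = [(0, 1, 2), (5, 6, 7)]
truth = np.array([[1, 0, 1, 1, 0]] * 4 + [[0, 1, 1, 0, 0]] * 4)
mask = (np.add.outer(np.arange(n), np.arange(5)) % 2) == 0  # observe half the entries
R = np.where(mask, truth, 0)  # unobserved entries hold a placeholder 0

W = build_weights(n, edges, hyperedges)
labels = spectral_two_clusters(W)
V = majority_vectors(R, mask, labels)
labels, V = refine(R, mask, W, labels, V)
R_hat = V[labels]  # completed rating matrix: each user inherits its cluster's vector
```

In this toy instance only half of the entries are observed, yet the cluster structure carried by the graph and hyperedges lets each user inherit a full rating vector from its cluster. This is the intuition behind the reduction in required sample probability, though the sketch makes no claim about the threshold itself.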
