Randomized Approach to Matrix Completion: Applications in Collaborative Filtering and Image Inpainting
Antonina Krajewska, Ewa Niewiadomska-Szynkiewicz
TL;DR
The paper tackles matrix completion for tall, incomplete matrices by introducing Columns Selected Matrix Completion (CSMC), a two-stage framework that first completes a reduced column submatrix and then recovers the full matrix via a least-squares step. Theoretical guarantees show exact recovery with high probability under standard incoherence assumptions when the number of sampled columns satisfies $d = \mathcal{O}(r \log r)$ and the number of observed entries satisfies $|\Omega| = \mathcal{O}(n_2 r \log(n_2 r))$. Two scalable algorithms, CSNN and CSPGD, are proposed for problems of different sizes, and open-source implementations are provided. Empirical results on synthetic data, movie-rating datasets, and image inpainting show that CSMC achieves reconstruction quality comparable to state-of-the-art convex matrix completion methods while significantly reducing computational runtime, which makes it practical for large-scale matrices with highly imbalanced dimensions.
Abstract
We present a novel method for matrix completion, specifically designed for matrices where one dimension significantly exceeds the other. Our Columns Selected Matrix Completion (CSMC) method combines Column Subset Selection and Low-Rank Matrix Completion to efficiently reconstruct incomplete datasets. In each step, CSMC solves a convex optimization problem. We introduce two algorithms to implement CSMC, each tailored to problems of different sizes. A formal analysis is provided, outlining the necessary assumptions and the probability of obtaining a correct solution. To assess the impact of matrix size, rank, and the ratio of missing entries on solution quality and computation time, we conducted experiments on synthetic data. The method was also applied to two real-world problems: recommendation systems and image inpainting. Our results show that CSMC provides solutions of the same quality as state-of-the-art matrix completion algorithms based on convex optimization, while achieving significant reductions in computational runtime.
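The two-stage idea (complete a sampled column submatrix, then recover the remaining columns by least squares) can be illustrated with a minimal NumPy sketch. This is not the CSNN or CSPGD algorithm: the first stage is replaced by a simple iterative rank-r SVD imputation as a stand-in for the convex solver, the rank is assumed to be known, columns are sampled uniformly, and the long dimension is assumed to be the column dimension. The function names `complete_low_rank` and `csmc_sketch` and all problem sizes are hypothetical choices made for illustration.

```python
import numpy as np


def complete_low_rank(X_obs, mask, rank, n_iter=300):
    """Stand-in for stage 1: iterative rank-`rank` SVD imputation.

    NOTE: an assumption for illustration, not the convex solver of the paper.
    """
    X = np.where(mask, X_obs, 0.0)  # start with zeros in the missing entries
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-`rank` approximation
        X = np.where(mask, X_obs, X_low)  # keep observed entries, refill the rest
    return X_low


def csmc_sketch(M_obs, mask, rank, n_cols, rng):
    """Two-stage sketch: complete sampled columns, then least-squares recovery."""
    n1, n2 = M_obs.shape

    # Stage 1: sample columns uniformly at random and complete the submatrix.
    cols = rng.choice(n2, size=n_cols, replace=False)
    C_hat = complete_low_rank(M_obs[:, cols], mask[:, cols], rank)

    # Orthonormal basis for the column space of the completed submatrix.
    U = np.linalg.svd(C_hat, full_matrices=False)[0][:, :rank]

    # Stage 2: fit each column of the full matrix on its observed rows only.
    M_hat = np.empty((n1, n2))
    for j in range(n2):
        rows = mask[:, j]
        coef, *_ = np.linalg.lstsq(U[rows], M_obs[rows, j], rcond=None)
        M_hat[:, j] = U @ coef
    return M_hat


# Synthetic example: exact rank-3 matrix with about 70% of entries observed.
rng = np.random.default_rng(0)
n1, n2, r = 60, 400, 3
M = rng.standard_normal((n1, r)) @ rng.standard_normal((r, n2))
mask = rng.random((n1, n2)) < 0.7
M_obs = np.where(mask, M, 0.0)

M_hat = csmc_sketch(M_obs, mask, rank=r, n_cols=40, rng=rng)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(f"relative error: {rel_err:.2e}")
```

In this sketch the expensive completion step runs only on a 60 x 40 submatrix, while the remaining columns each require one small least-squares solve, which is where the runtime saving of a column-selection approach would be expected to come from.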
