Efficient Model-Based Collaborative Filtering with Fast Adaptive PCA

Xiangyun Ding; Wenjian Yu; Yuyang Xie; Shenghua Liu

TL;DR

This work addresses efficient model-based collaborative filtering (CF) for matrix completion by introducing a fast adaptive PCA framework that automatically selects the latent dimensionality $k$. It combines a novel termination mechanism with a fixed-precision randomized QB factorization to reach near-optimal prediction accuracy with substantial runtime gains on large sparse datasets. In the reported experiments, the adaptive PCA is up to 2.7X faster than the original fixed-precision randomized SVD approach and up to 6.7X faster than Matlab's svds, while preserving accuracy. The CF pipeline, with automatic determination of $k$, is reported to outperform traditional SVD-based CF and to run more than 10X faster than regularized matrix factorization (RMF) and fast singular value thresholding (SVT) with comparable or better accuracy. It is also much more computationally efficient than deep-learning-based CF, at a marginal loss in accuracy. The approach is parameter-free, scales to MovieLens data with 20M ratings, and offers a practical balance between accuracy and efficiency for real-world recommender systems.

Abstract

A model-based collaborative filtering (CF) approach using fast adaptive randomized singular value decomposition (SVD) is proposed for the matrix completion problem in recommender systems. First, a fast adaptive PCA framework is presented, which combines the fixed-precision randomized matrix factorization algorithm [1] with acceleration techniques for handling large sparse data. Then, a novel termination mechanism for the adaptive PCA is proposed to automatically determine the number of latent factors that achieves near-optimal prediction accuracy in the subsequent model-based CF. The resulting CF approach has good accuracy while inheriting high runtime efficiency. Experiments on real data show that the proposed adaptive PCA is up to 2.7X and 6.7X faster than the original fixed-precision SVD approach [1] and svds in Matlab, respectively, while preserving accuracy. The proposed model-based CF approach efficiently processes the MovieLens data with 20M ratings and exhibits more than 10X speedup over the regularized matrix factorization based approach [2] and the fast singular value thresholding approach [3], with comparable or better accuracy. It also has the advantage of being parameter-free. Compared with the deep-learning-based CF approach, the proposed approach is much more computationally efficient, with only a marginal loss in performance.
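
To make the idea of a fixed-precision randomized factorization concrete, the following is a minimal sketch of a generic blocked randomized QB factorization that grows the rank until a relative Frobenius-norm error tolerance is met. It is not the authors' algorithm: the function name `rand_qb_fixed_precision`, its parameters (`rel_tol`, `block`, `max_rank`, `seed`), and the dense NumPy setting are assumptions made for illustration, and neither the sparse-data acceleration techniques nor the CF-specific termination mechanism described above is reproduced. The stopping test relies on the identity $\|A - QB\|_F^2 = \|A\|_F^2 - \|B\|_F^2$, which holds when $Q$ has orthonormal columns and $B = Q^T A$.

```python
import numpy as np

def rand_qb_fixed_precision(A, rel_tol=0.1, block=8, max_rank=None, seed=0):
    """Illustrative blocked randomized QB factorization (hypothetical helper).

    Grows Q (orthonormal columns) and B = Q.T @ A block by block until
    ||A - Q @ B||_F <= rel_tol * ||A||_F or max_rank columns are reached.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    max_rank = min(m, n) if max_rank is None else min(max_rank, m, n)
    Q = np.zeros((m, 0))
    B = np.zeros((0, n))
    norm_A2 = np.linalg.norm(A, "fro") ** 2
    resid2 = norm_A2  # squared residual norm, tracked without forming A - Q @ B
    while Q.shape[1] < max_rank and resid2 > (rel_tol ** 2) * norm_A2:
        b = min(block, max_rank - Q.shape[1])
        Omega = rng.standard_normal((n, b))
        # Sample the range of the current residual A - Q @ B.
        Y = A @ Omega - Q @ (B @ Omega)
        Qi, _ = np.linalg.qr(Y)
        # Re-orthogonalize the new block against the columns found so far.
        Qi, _ = np.linalg.qr(Qi - Q @ (Q.T @ Qi))
        Bi = Qi.T @ A
        Q = np.hstack([Q, Qi])
        B = np.vstack([B, Bi])
        resid2 -= np.linalg.norm(Bi, "fro") ** 2
    return Q, B

def svd_from_qb(Q, B):
    """Recover an approximate truncated SVD of A from its QB factors."""
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_hat, s, Vt

if __name__ == "__main__":
    # Synthetic, exactly rank-5 matrix used only to exercise the sketch.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
    Q, B = rand_qb_fixed_precision(A, rel_tol=0.05, block=5)
    U, s, Vt = svd_from_qb(Q, B)
    print("selected rank:", Q.shape[1])
    print("relative error:", np.linalg.norm(A - Q @ B) / np.linalg.norm(A))
```

In this sketch the rank is chosen by the approximation error alone. The approach described in the abstract instead ties termination to prediction accuracy in the subsequent CF step, so the rank selected here should not be read as the $k$ the paper would choose.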
