Efficient Model-Based Collaborative Filtering with Fast Adaptive PCA

Xiangyun Ding, Wenjian Yu, Yuyang Xie, Shenghua Liu

TL;DR

This work tackles efficient model-based collaborative filtering for matrix completion by introducing a fast adaptive PCA framework that automatically selects the latent dimensionality $k$. By integrating a novel termination mechanism with a fixed-precision randomized QB factorization, the method achieves near-optimal prediction accuracy with substantial runtime gains on large sparse datasets. Empirically, the adaptive PCA runs up to 2.7X faster than the original fixed-precision randomized SVD and up to 6.7X faster than Matlab's svds while preserving accuracy. The CF pipeline, with automatic $k$ determination, is reported to outperform traditional SVD-based CF and to be more than 10X faster than regularized matrix factorization (RMF) and fast singular value thresholding (SVT) at comparable or better accuracy, while remaining much more computationally efficient than a deep-learning approach at only a marginal loss in accuracy. The approach is parameter-free, scales to datasets like MovieLens-20M, and offers a practical balance between accuracy and efficiency for real-world recommender systems.

Abstract

A model-based collaborative filtering (CF) approach utilizing fast adaptive randomized singular value decomposition (SVD) is proposed for the matrix completion problem in recommender systems. First, a fast adaptive PCA framework is presented which combines the fixed-precision randomized matrix factorization algorithm [1] with acceleration techniques for handling large sparse data. Then, a novel termination mechanism for the adaptive PCA is proposed to automatically determine the number of latent factors that achieves near-optimal prediction accuracy in the subsequent model-based CF. The resulting CF approach has good accuracy while inheriting high runtime efficiency. Experiments on real data show that the proposed adaptive PCA is up to 2.7X and 6.7X faster than the original fixed-precision SVD approach [1] and svds in Matlab, respectively, while preserving accuracy. The proposed model-based CF approach efficiently processes the MovieLens data with 20M ratings and exhibits more than 10X speedup over the regularized matrix factorization based approach [2] and the fast singular value thresholding approach [3], with comparable or better accuracy. It also has the advantage of being parameter-free. Compared with the deep-learning-based CF approach, the proposed approach is much more computationally efficient, with only a marginal performance loss.
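To make the idea of a fixed-precision randomized QB factorization concrete, here is a minimal sketch of a generic blocked variant. It is not the paper's Alg. 2 or Alg. 3: the function name `rand_qb_fixed_precision` and the parameters `rel_tol`, `block`, and `max_rank` are assumptions made for illustration. It uses dense NumPy arrays rather than sparse storage and omits power iteration. The stopping test relies on the identity $\|\mathbf{A}-\mathbf{Q}\mathbf{B}\|_F^2 = \|\mathbf{A}\|_F^2 - \|\mathbf{B}\|_F^2$, which holds when $\mathbf{Q}$ has orthonormal columns and $\mathbf{B}=\mathbf{Q}^T\mathbf{A}$.

```python
import numpy as np

def rand_qb_fixed_precision(A, rel_tol=0.1, block=10, max_rank=None, seed=0):
    """Grow Q (orthonormal columns) and B = Q.T @ A block by block until
    ||A - Q B||_F <= rel_tol * ||A||_F or max_rank columns are reached.
    Illustrative sketch only; names and defaults are hypothetical."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    max_rank = min(m, n) if max_rank is None else max_rank
    Q = np.zeros((m, 0))
    B = np.zeros((0, n))
    total = np.linalg.norm(A, "fro") ** 2
    resid = total  # tracks ||A - Q B||_F^2 without forming the residual
    while Q.shape[1] < max_rank and resid > (rel_tol ** 2) * total:
        b = min(block, max_rank - Q.shape[1])
        omega = rng.standard_normal((n, b))      # Gaussian test matrix
        Y = A @ omega - Q @ (B @ omega)          # sample the current residual
        Qi, _ = np.linalg.qr(Y)
        # Re-orthogonalize the new block against the existing basis.
        Qi, _ = np.linalg.qr(Qi - Q @ (Q.T @ Qi))
        Bi = Qi.T @ A                            # Qi is orthogonal to Q, so this equals Qi.T @ (A - Q B)
        Q = np.hstack([Q, Qi])
        B = np.vstack([B, Bi])
        resid -= np.linalg.norm(Bi, "fro") ** 2  # cheap error update
    return Q, B
```

The number of columns of `Q` on return plays the role of the adaptively chosen rank; an SVD of the small matrix `B` then yields approximate singular values and vectors of `A`.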

Paper Structure

This paper contains 12 sections, 1 theorem, 6 equations, 3 figures, 3 tables, 5 algorithms.

Key Result

Theorem 1

Provided that the number of iterations in the outer loop is the same and $q=2p+2$, the $\mathbf{Q}$ and $\mathbf{B}$ obtained from the adaptive PCA framework (Alg. 3) are the same as those obtained from the fixed-precision randomized QB factorization (Alg. 2) in exact arithmetic.

Figures (3)

  • Figure 1: The computed singular values for different matrices, showing the accuracy of proposed fast adaptive PCA framework for sparse data.
  • Figure 2: The MAE on the validation set and test set, and the corresponding computational time of the $k$-adaptive rPCA based rating prediction vs. parameter $k$. The minimal MAE on the validation set occurs at $k=520$, and on the test set at $k=540$. The test data is from MovieLens-20M.
  • Figure 3: The MAE of the RMF model based collaborative filtering approach on MovieLens-20M vs. its computational time. The comparison with the proposed approach shows that the latter is 10X faster for achieving the same accuracy.
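The kind of MAE-versus-$k$ curve described for Figure 2 can be illustrated with a brute-force sweep. The sketch below is not the paper's termination mechanism, which picks $k$ automatically. It assumes a small dense rating matrix with boolean masks for observed entries, fills missing entries with the global mean, and uses a full SVD for clarity. The function name `mae_for_ranks` and its arguments are hypothetical.

```python
import numpy as np

def mae_for_ranks(R_train, mask_train, R_val, mask_val, ranks):
    """Return {k: validation MAE} for rank-k SVD rating prediction.
    Illustrative sketch only; a real pipeline would use sparse data
    and a randomized factorization instead of a dense SVD."""
    mu = R_train[mask_train].mean()               # global mean of observed ratings
    X = np.where(mask_train, R_train - mu, 0.0)   # centered, zero-filled matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    results = {}
    for k in ranks:
        pred = (U[:, :k] * s[:k]) @ Vt[:k] + mu   # rank-k reconstruction
        err = np.abs(pred[mask_val] - R_val[mask_val])
        results[k] = float(err.mean())            # mean absolute error
    return results
```

Given the returned dictionary `res`, `min(res, key=res.get)` selects the rank with the lowest validation MAE, which is the quantity the figure reports as $k=520$ for the validation set.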

Theorems & Definitions (2)

  • Theorem 1
  • proof