Does Weighting Improve Matrix Factorization for Recommender Systems?
Alex Ayoub, Samuel Robertson, Dawen Liang, Harald Steck, Nathan Kallus
TL;DR
This paper investigates whether upweighting observed interactions in implicit-feedback matrix factorization is consistently beneficial. The authors systematically analyze WMF, AWMF, and full-rank (EASE-like) models in both unregularized and regularized regimes, and derive exact closed-form solutions using vectorization and Kronecker algebra, which enables efficient optimization via preconditioned Conjugate Gradient. The key finding is that large unweighted linear models often match or outperform their weighted counterparts, which challenges the conventional wisdom. Weighting can still help low-capacity models, depending on the regularization scheme and the data. For practitioners, the results suggest reconsidering weighting strategies for high-capacity models and relying on regularization choices to balance performance and scalability.
Abstract
Matrix factorization is a widely used approach for top-N recommendation and collaborative filtering. When implemented on implicit feedback data (such as clicks), a common heuristic is to upweight the observed interactions. This strategy has been shown to improve performance for certain algorithms. In this paper, we conduct a systematic study of various weighting schemes and matrix factorization algorithms. Somewhat surprisingly, we find that training with unweighted data can perform comparably to, and sometimes outperform, training with weighted data, especially for large models. This observation challenges the conventional wisdom. Nevertheless, we identify cases where weighting can be beneficial, particularly for models with lower capacity and specific regularization schemes. We also derive efficient algorithms for exactly minimizing several weighted objectives that were previously considered computationally intractable. Our work provides a comprehensive analysis of the interplay between weighting, regularization, and model capacity in matrix factorization for recommender systems.
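To make the idea of exactly minimizing a weighted objective concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm. It assumes a full-rank item-item model B without the zero-diagonal constraint used by EASE, a hypothetical weighting W = 1 + alpha * X that upweights observed interactions, and a plain L2 penalty lam. It minimizes ||sqrt(W) * (X - X B)||_F^2 + lam ||B||_F^2 with plain (unpreconditioned) Conjugate Gradient applied to the normal equations, and checks the result against a column-by-column direct solve. All function and variable names are illustrative.

```python
import numpy as np

def normal_op(B, X, W, lam):
    # Linear map B -> X^T (W * (X B)) + lam * B. In vectorized form this is
    # (I kron X)^T diag(vec(W)) (I kron X) + lam * I acting on vec(B).
    return X.T @ (W * (X @ B)) + lam * B

def weighted_fullrank_cg(X, W, lam, max_iter=500, rtol=1e-10):
    # Plain CG on the normal equations, using the Frobenius inner product.
    # Assumption: no preconditioner and no zero-diagonal constraint.
    rhs = X.T @ (W * X)
    B = np.zeros_like(rhs)
    R = rhs.copy()                      # residual for the zero initial guess
    P = R.copy()                        # search direction
    rs = np.sum(R * R)
    stop = (rtol * np.linalg.norm(rhs)) ** 2
    for _ in range(max_iter):
        if rs <= stop:
            break
        AP = normal_op(P, X, W, lam)
        step = rs / np.sum(P * AP)
        B += step * P
        R -= step * AP
        rs_new = np.sum(R * R)
        P = R + (rs_new / rs) * P
        rs = rs_new
    return B

def weighted_fullrank_columnwise(X, W, lam):
    # Reference solution: each column j is an independent weighted ridge
    # regression with weights W[:, j].
    n_items = X.shape[1]
    B = np.empty((n_items, n_items))
    eye = np.eye(n_items)
    for j in range(n_items):
        d = W[:, j]
        gram = X.T @ (d[:, None] * X) + lam * eye
        B[:, j] = np.linalg.solve(gram, X.T @ (d * X[:, j]))
    return B

# Toy implicit-feedback matrix: 30 users, 8 items, binary interactions.
rng = np.random.default_rng(0)
X = (rng.random((30, 8)) < 0.3).astype(float)
W = 1.0 + 4.0 * X                       # hypothetical weights, alpha = 4
B_cg = weighted_fullrank_cg(X, W, lam=1.0)
B_ref = weighted_fullrank_columnwise(X, W, lam=1.0)
print("max abs difference:", np.max(np.abs(B_cg - B_ref)))
```

The column-wise solve needs one item-by-item linear system per item, which is the cost that makes weighted objectives expensive at scale. The CG variant only needs products with X and elementwise multiplication by W, which is why an iterative solver is attractive here. Setting W to all ones recovers the unweighted ridge problem.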
