ImplicitSLIM and How it Improves Embedding-based Collaborative Filtering

Ilya Shenbin, Sergey Nikolenko

TL;DR

ImplicitSLIM addresses the challenge of improving embeddings for sparse, high-dimensional implicit-feedback data in collaborative filtering by deriving embeddings from SLIM-like item-item interactions using a memory-efficient, unsupervised approach. The method fuses ideas from EASE and Locally Linear Embeddings, introducing a closed-form second-step update via an unconstrained objective and leveraging Woodbury identities to avoid large inversions, with an auxiliary matrix A set to the embedding matrix Q. It serves as a versatile tool for initialization and regularization across MF, PLRec, VAEs, and graph-based CF models, and demonstrates significant performance and convergence advantages, including state-of-the-art results when combined with RecVAE and H+Vamp(Gated) on MovieLens-20M and Netflix Prize. Overall, ImplicitSLIM provides a generic, scalable mechanism to enrich and stabilize embeddings in a wide range of embedding-based recommender systems, with measurable gains in accuracy and efficiency.
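
To make the Woodbury step concrete, here is a minimal NumPy sketch of the general identity the summary refers to. It is not the paper's update rule. The matrix shapes, the regularizer `lam`, and the helper name `woodbury_solve` are assumptions made only for illustration. It shows how a system with an items-by-items matrix of the form lam * I + Q Q^T can be solved with nothing larger than a k-by-k factorization, where k is the embedding dimension.

```python
import numpy as np


def woodbury_solve(Q, lam, B):
    """Solve (lam * I_n + Q @ Q.T) X = B without forming any n x n matrix.

    Uses the Woodbury identity:
        (lam*I_n + Q Q^T)^{-1} = (I_n - Q (lam*I_k + Q^T Q)^{-1} Q^T) / lam
    Q is n x k (hypothetical item embeddings), B is n x m.
    """
    k = Q.shape[1]
    small = lam * np.eye(k) + Q.T @ Q            # k x k system only
    return (B - Q @ np.linalg.solve(small, Q.T @ B)) / lam


# Illustrative sizes (assumed): 200 items, 8-dimensional embeddings.
rng = np.random.default_rng(0)
n_items, dim, lam = 200, 8, 0.5
Q = rng.normal(size=(n_items, dim))
B = rng.normal(size=(n_items, 3))

# Reference answer using the full n x n matrix (feasible only at toy scale).
x_direct = np.linalg.solve(lam * np.eye(n_items) + Q @ Q.T, B)
x_woodbury = woodbury_solve(Q, lam, B)

max_abs_diff = float(np.max(np.abs(x_direct - x_woodbury)))
print(f"max abs difference: {max_abs_diff:.2e}")
```

The direct solve costs on the order of n^3 operations and n^2 memory, while the Woodbury route needs only an n-by-k product and a k-by-k solve, which is the kind of saving that matters when n is the number of items in a catalog.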

Abstract

We present ImplicitSLIM, a novel unsupervised learning approach for sparse high-dimensional data, with applications to collaborative filtering. Sparse linear methods (SLIM) and their variations show outstanding performance, but they are memory-intensive and hard to scale. ImplicitSLIM improves embedding-based models by extracting embeddings from SLIM-like models in a computationally cheap and memory-efficient way, without explicit learning of heavy SLIM-like models. We show that ImplicitSLIM improves performance and speeds up convergence for both state of the art and classical collaborative filtering methods. The source code for ImplicitSLIM, related models, and applications is available at https://github.com/ilya-shenbin/ImplicitSLIM.

Paper Structure (31 sections, 40 equations, 3 figures, 4 tables, 7 algorithms)

Figures (3)

  • Figure 1: ImplicitSLIM and SLIM-LLE applied to MF and PLRec (setups defined in the evaluation section); the X-axis shows embedding dimensions, the Y-axis shows NDCG@100.
  • Figure 2: Sample convergence plots for state-of-the-art models with ImplicitSLIM; the X-axis shows training epochs, the Y-axis shows NDCG@20 on Yelp2018 and NDCG@100 on the other datasets.
  • Figure 3: Convergence plots for MF, MF + ImplicitSLIM, and EASE; the X-axis shows wall-clock time in seconds, the Y-axis shows NDCG@100; MF and MF + ImplicitSLIM have been evaluated with different embedding dimensions shown in the figure.