Learning Category Trees for ID-Based Recommendation: Exploring the Power of Differentiable Vector Quantization

Qijiong Liu, Lu Fan, Jiaren Xiao, Jieming Zhu, Xiao-Ming Wu

TL;DR

This work devises CAGE, a differentiable vector quantization framework for automatic category tree generation. CAGE learns and refines categorical code representations and entity embeddings simultaneously, end to end, starting from randomly initialized states.

Abstract

Category information plays a crucial role in enhancing the quality and personalization of recommender systems. However, item category information is not always available, particularly in ID-based recommendation. In this work, we propose a novel approach to automatically learn and generate entity (i.e., user or item) category trees for ID-based recommendation. Specifically, we devise a differentiable vector quantization framework for automatic category tree generation, namely CAGE, which enables the simultaneous learning and refinement of categorical code representations and entity embeddings in an end-to-end manner, starting from randomly initialized states. Owing to its high adaptability, CAGE can be easily integrated into both sequential and non-sequential recommender systems. We validate the effectiveness of CAGE on various recommendation tasks, including list completion, collaborative filtering, and click-through rate prediction, across different recommendation models. We release the code and data so that others can reproduce the reported results.
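To make the term "vector quantization" concrete, the following is a minimal sketch of the generic quantization step as it is commonly formulated (nearest-codeword lookup plus a commitment penalty). It is not the CAGE implementation: the codebook values, the embedding, and the helper names `quantize` and `commitment_loss` are all illustrative assumptions, and the abstract does not specify how CAGE makes the lookup differentiable. In an autodiff framework, the non-differentiable nearest-neighbor selection is typically handled with a straight-through gradient estimator; that part is omitted here.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(embedding, codebook):
    """Return (index, codeword) of the codebook entry nearest to the embedding."""
    index = min(
        range(len(codebook)),
        key=lambda k: squared_distance(embedding, codebook[k]),
    )
    return index, codebook[index]


def commitment_loss(embedding, codeword, beta):
    """Generic commitment penalty: beta * ||embedding - codeword||^2."""
    return beta * squared_distance(embedding, codeword)


# Hypothetical toy values, chosen only for illustration.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
item_embedding = [0.9, 1.2]

code, codeword = quantize(item_embedding, codebook)
loss = commitment_loss(item_embedding, codeword, beta=0.25)
```

Here the discrete index `code` plays the role of a learned category label for the item, and `codeword` is the vector that would be passed on to the recommender alongside the ID embedding.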

Paper Structure (40 sections, 16 equations, 5 figures, 8 tables)

Figures (5)

  • Figure 1: Illustration of our approach in learning category trees for ID-based recommendation. In contrast to traditional methods that solely offer item or user IDs to the recommender system, our approach involves implicit learning of user/item category trees. The category information, encoded as vectors, is subsequently integrated with the user/item ID and provided as input to the recommender system.
  • Figure 2: Comparison between (a) the traditional three-stage vector quantization pipeline for content-based recommendation and (b) our proposed end-to-end differentiable vector quantization framework for ID-based recommendation.
  • Figure 3: Overview of our proposed category tree generation framework (CAGE).
  • Figure 4: Influence of the use of user and item CAGE in the non-sequential recommenders.
  • Figure 5: Impact of (a) the residual connection weight $\alpha$, (b) the quantization commitment cost $\beta$, (c) the codebook classification loss weight $\omega_\text{c}$, and (d) the quantization loss weight $\omega_\text{q}$. For (a), we use the model with $\alpha=0$ as the reference baseline and measure the relative improvement of each metric over that baseline for various values of $\alpha$, defined as $(m_\alpha - m_0) / m_0 \times 100\%$, where $m$ is one of the metrics in {N@5, N@10, HR@5, HR@10}. The relative improvement at $\alpha=0$ is therefore 0% by definition. Similarly, we use the model with $\beta=0$ as the reference baseline for (b), $\omega_\text{c}=0$ for (c), and $\omega_\text{q}=0$ for (d).
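
The relative-improvement measure in the Figure 5 caption can be illustrated with a short sketch. The metric values below are hypothetical placeholders, not results from the paper, and the function name `relative_improvement` is an assumption introduced for this example.

```python
def relative_improvement(metric, baseline):
    """Relative improvement in percent: (m - m_0) / m_0 * 100."""
    if baseline == 0:
        raise ValueError("baseline metric must be non-zero")
    return (metric - baseline) / baseline * 100.0


# Hypothetical N@5 values keyed by alpha; illustrative only, not from the paper.
ndcg_at_5 = {0.0: 0.200, 0.2: 0.210, 0.5: 0.220}

baseline = ndcg_at_5[0.0]  # the alpha = 0 model is the reference
improvements = {
    alpha: relative_improvement(value, baseline)
    for alpha, value in ndcg_at_5.items()
}
```

With these placeholder numbers, the entry for `alpha = 0.0` is 0% by construction, and a metric of 0.220 against a baseline of 0.200 corresponds to roughly a 10% relative improvement. The same computation applies to panels (b) through (d) with the respective zero-valued setting as the baseline.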