Towards Scalable Semantic Representation for Recommendation

Taolin Zhang, Junwei Pan, Jinpeng Wang, Yaohua Zha, Tao Dai, Bin Chen, Ruisheng Luo, Xiaoxiang Deng, Yuan Wang, Ming Yue, Jie Jiang, Shu-Tao Xia

TL;DR

This work tackles the challenge of transferring rich semantic information from high-dimensional LLM embeddings to low-dimensional recommendation ID spaces. It introduces Mixture-of-Codes (MoC), a two-stage framework that uses multiple parallel codebooks to quantize LLM embeddings and a downstream fusion network to implicitly combine the resulting Semantic IDs for recommendation tasks. Empirical results across three Amazon domains and multiple CTR models show that MoC outperforms single-code and hierarchical baselines in terms of discriminability and dimension robustness, with clear scaling advantages as the representation size increases. The approach enables scalable, robust semantic representations for recommendations, offering improvements in both predictive performance and information preservation when expanding semantic representation capacity.

Abstract

With recent advances in large language models (LLMs), a growing body of research has developed LLM-based Semantic IDs to enhance the performance of recommendation systems. However, the dimension of these embeddings must match that of the ID embeddings used in recommendation, which is usually much smaller than the original embedding dimension. Such dimension compression inevitably degrades the discriminability and dimension robustness of the LLM embeddings, which motivates us to scale up the semantic representation. In this paper, we propose Mixture-of-Codes, which first constructs multiple independent codebooks for the LLM representation in the indexing stage, and then uses the resulting Semantic Representation together with a fusion module in the downstream recommendation stage. Extensive analysis and experiments demonstrate that our method scales better in both discriminability and dimension robustness, leading to the best scale-up performance in recommendation.
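The two stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the codebooks here are fit with plain k-means under different random seeds as a stand-in for whatever quantizer the paper trains, and the fusion module is reduced to a single dense layer with ReLU. All names (`fit_codebook`, `semantic_ids`, `fuse`), the sizes, and the random "LLM embeddings" are hypothetical and chosen only for illustration.

```python
import numpy as np

def assign(x, centers):
    """Return the index of the nearest center for every row of x."""
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

def fit_codebook(x, num_codes, seed, iters=10):
    """Plain k-means (Lloyd's algorithm); an assumed stand-in quantizer."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=num_codes, replace=False)].copy()
    for _ in range(iters):
        ids = assign(x, centers)
        for k in range(num_codes):
            members = x[ids == k]
            if len(members) > 0:  # leave empty clusters where they are
                centers[k] = members.mean(axis=0)
    return centers

def semantic_ids(x, codebooks):
    """Indexing stage: one Semantic ID per item per parallel codebook."""
    return np.stack([assign(x, c) for c in codebooks], axis=1)

def fuse(ids, tables, w, b):
    """Downstream stage: look up one low-dimensional embedding per
    codebook, concatenate them, and mix with a dense layer + ReLU."""
    parts = [tables[m][ids[:, m]] for m in range(ids.shape[1])]
    h = np.concatenate(parts, axis=1)
    return np.maximum(h @ w + b, 0.0)

# Hypothetical sizes: 200 items, 64-dim "LLM" embeddings, 3 codebooks.
rng = np.random.default_rng(0)
llm_emb = rng.normal(size=(200, 64))
num_codebooks, num_codes, id_dim = 3, 16, 8

codebooks = [fit_codebook(llm_emb, num_codes, seed=s)
             for s in range(num_codebooks)]
ids = semantic_ids(llm_emb, codebooks)

# Randomly initialized parameters; in practice these would be trained
# jointly with the recommendation model.
tables = [rng.normal(scale=0.1, size=(num_codes, id_dim))
          for _ in range(num_codebooks)]
w = rng.normal(scale=0.1, size=(num_codebooks * id_dim, id_dim))
b = np.zeros(id_dim)
item_repr = fuse(ids, tables, w, b)
```

In this sketch, scaling up the semantic representation corresponds to increasing `num_codebooks`: each additional codebook contributes one more Semantic ID per item and one more embedding table to the fusion input, while the output dimension `id_dim` stays matched to the recommendation model's ID embedding size.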

Paper Structure

This paper contains 18 sections, 9 equations, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Reconstruction Error with different sets of Semantic IDs.
  • Figure 2: Scalability on discriminability of various methods.
  • Figure 3: Normalized Mutual Information (NMI) of Semantic Representation with 7x scaling factor.
  • Figure 4: Scalability of Dimension Robustness regarding different scaling factors. Each figure presents the singular spectrum of the semantic representation at the given scaling factor.
  • Figure 5: Comparison among Multi-Embedding VQ, RQ-VAE and Mixture-of-Codes. Codebooks shown in darker colors contain more information relevant to the input data. (a) Multi-Embedding VQ builds independent embeddings for a single set of semantic IDs and is equivalent to performing index copying for downstream models. (b) RQ-VAE uses hierarchical codebooks, and its high-level semantic IDs are less informative. (c) Our Mixture-of-Codes uses parallel codebooks to capture important semantics in the original LLM space and employs a fusion network for better generalization in downstream tasks.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Definition 3.1: Discriminability Scalability of Semantic Representation
  • Definition 3.2: Dimension Robustness Scalability of Semantic Representation