Semantics Meet Signals: Dual Codebook Representation Learning for Generative Recommendation

Zheng Hui, Xiaokai Wei, Reza Shirkavand, Chen Wang, Weizhi Zhang, Alejandro Peláez, Michelle Gong

TL;DR

Generative recommender systems suffer from representation entanglement when using a single codebook and from static capacity allocation that hurts tail-item generalization. The authors propose FlexCode, a dual-codebook architecture with a popularity-aware MoE router that allocates a fixed token budget $L$ between a Collaborative Codebook $C_{\mathrm{CF}}$ and a Semantic Codebook $C_{\mathrm{SEM}}$, plus cross-codebook alignment and regularizers. They jointly train semantic and collaborative encoders, the cross-codebook alignment, and the autoregressive generator, achieving improved Recall@K and NDCG@K on public benchmarks and industrial data, especially for tail items. This approach demonstrates a practical path toward balancing memorization and generalization in token-based generative recommenders and scales to real-world deployment.

Abstract

Generative recommendation has recently emerged as a powerful paradigm that unifies retrieval and generation, representing items as discrete semantic tokens and enabling flexible sequence modeling with autoregressive models. Despite its success, existing approaches rely on a single, uniform codebook to encode all items, overlooking the inherent imbalance between popular items rich in collaborative signals and long-tail items that depend on semantic understanding. We argue that this uniform treatment limits representational efficiency and hinders generalization. To address this, we introduce FlexCode, a popularity-aware framework that adaptively allocates a fixed token budget between a collaborative filtering (CF) codebook and a semantic codebook. A lightweight MoE dynamically balances CF-specific precision and semantic generalization, while an alignment and smoothness objective maintains coherence across the popularity spectrum. We perform experiments on both public and industrial-scale datasets, showing that FlexCode consistently outperforms strong baselines. FlexCode provides a new mechanism for token representation in generative recommenders, achieving stronger accuracy and tail robustness, and offering a new perspective on balancing memorization and generalization in token-based recommendation models.
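The idea of splitting a fixed token budget by popularity can be illustrated with a small sketch. This is not the authors' router: the abstract describes a learned MoE, whereas the code below substitutes a hand-set logistic gate on log-popularity. The function name `split_token_budget` and the parameters `midpoint` and `sharpness` are hypothetical, chosen only for illustration.

```python
import math


def split_token_budget(interaction_count, total_tokens=4,
                       midpoint=50.0, sharpness=1.0):
    """Split a fixed budget of `total_tokens` into (n_cf, n_sem).

    Assumption: a logistic gate on log-popularity stands in for the
    learned popularity-aware router; `midpoint` and `sharpness` are
    illustrative knobs, not parameters from the paper.
    """
    # Log-scale the count so heavy-tailed popularity does not saturate the gate.
    z = sharpness * (math.log1p(interaction_count) - math.log1p(midpoint))
    cf_share = 1.0 / (1.0 + math.exp(-z))

    # Popular items get more collaborative tokens; the remainder is semantic.
    n_cf = round(cf_share * total_tokens)
    n_sem = total_tokens - n_cf
    return n_cf, n_sem


if __name__ == "__main__":
    for count in (0, 5, 50, 500, 100_000):
        print(count, split_token_budget(count))
```

Under these assumptions, an item with no interactions spends its whole budget on semantic tokens, an item at the midpoint splits the budget evenly, and a very popular item spends it on collaborative tokens. The two counts always sum to the fixed budget.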

Paper Structure

This paper contains 28 sections, 17 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of the FlexCode framework for generative recommendation. Each item is encoded by a dual codebook with collaborative and semantic codebooks, aligned via a cross-codebook contrastive objective. A popularity-aware Mixture-of-Experts (MoE) router adaptively allocates the budget between codebooks, and an autoregressive Transformer is trained on the resulting sequences to generate items.
  • Figure 2: Performance evaluation on the large-scale industrial dataset.