Table of Contents
Fetching ...

PIT: A Dynamic Personalized Item Tokenizer for End-to-End Generative Recommendation

Huanjie Wang, Xinchen Luo, Honghui Bao, Zhang Zixing, Lejian Ren, Yunfan Wu, Hongwei Zhang, Liwei Guan, Guang Chen

TL;DR

PIT, a dynamic Personalized Item Tokenizer framework for end-to-end generative recommendation, which employs a co-generative architecture that harmonizes collaborative patterns through collaborative signal alignment and synchronizes item tokenizer with generative recommender via a co-evolution learning, enables the dynamic, joint, end-to-end evolution of both index construction and recommendation.

Abstract

Generative Recommendation has revolutionized recommender systems by reformulating retrieval as a sequence generation task over discrete item identifiers. Despite the progress, existing approaches typically rely on static, decoupled tokenization that ignores collaborative signals. While recent methods attempt to integrate collaborative signals into item identifiers either during index construction or through end-to-end modeling, they encounter significant challenges in real-world production environments. Specifically, the volatility of collaborative signals leads to unstable tokenization, and current end-to-end strategies often devolve into suboptimal two-stage training rather than achieving true co-evolution. To bridge this gap, we propose PIT, a dynamic Personalized Item Tokenizer framework for end-to-end generative recommendation, which employs a co-generative architecture that harmonizes collaborative patterns through collaborative signal alignment and synchronizes item tokenizer with generative recommender via a co-evolution learning. This enables the dynamic, joint, end-to-end evolution of both index construction and recommendation. Furthermore, a one-to-many beam index ensures scalability and robustness, facilitating seamless integration into large-scale industrial deployments. Extensive experiments on real-world datasets demonstrate that PIT consistently outperforms competitive baselines. In a large-scale deployment at Kuaishou, an online A/B test yielded a substantial 0.402% uplift in App Stay Time, validating the framework's effectiveness in dynamic industrial environments.

PIT: A Dynamic Personalized Item Tokenizer for End-to-End Generative Recommendation

TL;DR

PIT, a dynamic Personalized Item Tokenizer framework for end-to-end generative recommendation, which employs a co-generative architecture that harmonizes collaborative patterns through collaborative signal alignment and synchronizes item tokenizer with generative recommender via a co-evolution learning, enables the dynamic, joint, end-to-end evolution of both index construction and recommendation.

Abstract

Generative Recommendation has revolutionized recommender systems by reformulating retrieval as a sequence generation task over discrete item identifiers. Despite the progress, existing approaches typically rely on static, decoupled tokenization that ignores collaborative signals. While recent methods attempt to integrate collaborative signals into item identifiers either during index construction or through end-to-end modeling, they encounter significant challenges in real-world production environments. Specifically, the volatility of collaborative signals leads to unstable tokenization, and current end-to-end strategies often devolve into suboptimal two-stage training rather than achieving true co-evolution. To bridge this gap, we propose PIT, a dynamic Personalized Item Tokenizer framework for end-to-end generative recommendation, which employs a co-generative architecture that harmonizes collaborative patterns through collaborative signal alignment and synchronizes item tokenizer with generative recommender via a co-evolution learning. This enables the dynamic, joint, end-to-end evolution of both index construction and recommendation. Furthermore, a one-to-many beam index ensures scalability and robustness, facilitating seamless integration into large-scale industrial deployments. Extensive experiments on real-world datasets demonstrate that PIT consistently outperforms competitive baselines. In a large-scale deployment at Kuaishou, an online A/B test yielded a substantial 0.402% uplift in App Stay Time, validating the framework's effectiveness in dynamic industrial environments.
Paper Structure (39 sections, 13 equations, 6 figures, 4 tables, 1 algorithm)

This paper contains 39 sections, 13 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Illustration of the PIT framework comprising co-generative architecture when using user-guided minimum-loss selection mechanism in co-evolution learning.
  • Figure 2: The mechanism of beam index which shows how PIT dynamically maps multiple SIDs and item identifiers
  • Figure 3: The overall system deployment pipeline, illustrating the interplay between models training, real-time index synchronization, and online generative recommender inference.
  • Figure 4: Quantitative analysis of codebook utilization. (a) Codeword distribution across quantization layers on the Toys and Games dataset. (b) Overall comparison of average entropy across three datasets.
  • Figure 5: Architectural overview of the RQ-Kmeans initialization. The codebook is pre-trained via Item-to-Item (I2I) contrastive learning to provide a semantically rich starting point for subsequent dynamic Beam Index updates.
  • ...and 1 more figures