OnePiece: The Great Route to Generative Recommendation -- A Case Study from Tencent Algorithm Competition

Jiangxia Cao, Shuo Yang, Zijun Wang, Qinghai Tan

TL;DR

The paper investigates scaling laws in generative recommender systems by unifying retrieval and generation within a single encoder–decoder backbone. It introduces a Semantic Tokenizer with Collaborative Residual K-means to produce SID codes and a cascade inference pipeline that combines SID beam-search with InfoNCE-based scoring, trained under a joint objective. Empirical results show both SID-based generative losses and embedding-based InfoNCE losses follow power-law scaling with high fit (R^2>0.9), with deeper architectures delivering stronger ranking signals. The work demonstrates a scalable, efficient approach for industrial-scale generative recommendations and highlights directions toward billion-parameter multi-modal backbones and end-to-end differentiable optimization.

Abstract

In recent years, OpenAI's scaling laws have shown the remarkable capability of the next-token prediction paradigm in neural language modeling, pointing to a free-lunch way to improve model performance by scaling up model parameters. In RecSys, the retrieval stage also follows a 'next-token prediction' paradigm, recalling hundreds of items from the global item set; thus generative recommendation usually refers specifically to the retrieval stage (excluding tree-based methods). This raises a philosophical question: without a ground-truth next item, does generative recommendation also obey a scaling law? In retrospect, generative recommendation has two technical paradigms: (1) the ANN-based framework, which uses a compressed user embedding to retrieve the nearest items in embedding space, e.g., Kuaiformer; (2) the auto-regressive framework, which uses beam search to decode items from the whole item space, e.g., OneRec. In this paper, we devise a unified encoder-decoder framework to validate the scaling laws of both paradigms at the same time. Our empirical finding is that both losses follow power-law scaling laws with a high goodness of fit ($R^2 > 0.9$) within our unified architecture.
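To make the "power-law with $R^2 > 0.9$" claim concrete, the sketch below shows one common way such a fit can be checked: regress log-loss on log-parameter-count and report the coefficient of determination. The functional form $L(N) = a N^{-b}$, the function name `fit_power_law`, and the synthetic data points are all assumptions for illustration; the abstract does not state the exact parameterization or the measured values, so none of the numbers here come from the paper.

```python
import numpy as np

def fit_power_law(n_params, losses):
    """Fit L(N) = a * N**(-b) by least squares in log-log space.

    Returns (a, b, r2), where r2 is the coefficient of determination
    of the linear fit log L = log a - b * log N.
    """
    log_n = np.log(np.asarray(n_params, dtype=float))
    log_l = np.log(np.asarray(losses, dtype=float))
    slope, intercept = np.polyfit(log_n, log_l, 1)  # degree-1 polynomial fit
    pred = slope * log_n + intercept
    ss_res = np.sum((log_l - pred) ** 2)            # residual sum of squares
    ss_tot = np.sum((log_l - log_l.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return float(np.exp(intercept)), float(-slope), float(r2)

# Synthetic (hypothetical) model sizes and losses that follow an exact power law.
sizes = np.array([1e6, 4e6, 1.6e7, 6.4e7, 2.56e8])
synthetic_losses = 12.0 * sizes ** -0.08

a, b, r2 = fit_power_law(sizes, synthetic_losses)
print(f"a={a:.3f}, b={b:.3f}, R^2={r2:.4f}")
```

On real training curves the points scatter around the fitted line, and an $R^2$ above 0.9 in log-log space is the kind of threshold the abstract refers to. Computing $R^2$ in log space rather than on raw losses is itself a design choice assumed here.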

Paper Structure

This paper contains 20 sections, 4 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Scaling laws for the InfoNCE and Semantic ID prediction losses, measured on the TencentGR-100M dataset during the first epoch.
  • Figure 2: The training workflow of OnePiece.
  • Figure 3: Gini coefficient over training steps for all MoE layers. The rapid drop and stabilization at low values (0.1 to 0.4) indicate successful expert load balancing.
  • Figure 4: The hybrid inference pipeline of OnePiece (SID beam search + InfoNCE scoring).
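
The Figure 3 caption reads a low Gini coefficient as evidence of balanced expert load. A minimal sketch of how such a coefficient can be computed is given below, assuming the input is the number of tokens routed to each expert in one MoE layer. The function name `gini` and the example counts are hypothetical, and the paper may define or normalize the quantity differently.

```python
import numpy as np

def gini(loads):
    """Gini coefficient of non-negative per-expert loads.

    0 means perfectly even load; (n - 1) / n means one expert
    receives everything, where n is the number of experts.
    """
    x = np.sort(np.asarray(loads, dtype=float))  # ascending order
    n = x.size
    total = x.sum()
    if n == 0 or total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard closed form for sorted data.
    return float(2.0 * np.sum(ranks * x) / (n * total) - (n + 1) / n)

balanced = gini([5, 5, 5, 5])    # every expert gets the same token count
collapsed = gini([0, 0, 0, 20])  # a single expert gets all tokens
print(balanced, collapsed)
```

Under this definition, a value in the 0.1 to 0.4 range reported in the caption sits much closer to the balanced end than to the collapsed one.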