TBGRecall: A Generative Retrieval Model for E-commerce Recommendation Scenarios
Zida Liang, Changfa Wu, Dunxian Huang, Weiqiang Sun, Ziyang Wang, Yuliang Yan, Jian Wu, Yuning Jiang, Bo Zheng, Ke Chen, Silu Zhou, Yu Zhang
TL;DR
This work tackles the inefficiency of autoregressive generative models and their misalignment with retrieval in e-commerce. By introducing Next Session Prediction (NSP) and a session-aware, multi-component architecture (TSN, MSP, MoE) within TBGRecall, the authors convert generation into a session-level retrieval problem that aligns with ANN-based item retrieval. The framework leverages limited historical pre-training and Partial Incremental Training to achieve scalable, fast updates, supported by a tailored loss with per-scene normalization. Empirical results on RecFlow and Taobao show consistent gains over strong baselines, with notable online improvements in transaction metrics, supporting the practical viability of NSP-based generative retrieval for large-scale industrial systems. The work also demonstrates clear scaling laws and provides a concrete deployment blueprint, connecting theoretical advances with real-world e-commerce needs.
Abstract
Recommendation systems are essential tools in modern e-commerce, facilitating personalized user experiences by suggesting relevant products. Recent advances in generative models have shown potential for enhancing recommendation systems; however, these models are often poorly suited to retrieval tasks, primarily because they rely on autoregressive generation. Autoregressive approaches introduce sequential dependencies that impede efficient retrieval, as they are inherently unsuitable for generating multiple items without positional constraints within a single request session. To address these limitations, we propose TBGRecall, a framework integrating Next Session Prediction (NSP), designed to enhance generative retrieval models for e-commerce applications. Our reformulation partitions input samples into multi-session sequences, where each session comprises a session token followed by a set of item tokens, and further incorporates multiple optimizations tailored to the generative task in retrieval scenarios. In terms of training methodology, our pipeline combines pre-training on limited historical data with stochastic partial incremental training, which significantly improves training efficiency and highlights the importance of data recency over sheer data volume. Our extensive experiments, conducted on public benchmarks alongside a large-scale industrial dataset from Taobao, show that TBGRecall outperforms state-of-the-art recommendation methods and exhibits a clear scaling-law trend. Ultimately, NSP represents a significant advancement in the effectiveness of generative recommendation systems for e-commerce applications.
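To make the session-level input format more concrete, the following is a minimal sketch of how a user's interaction log might be partitioned into a multi-session sequence, with each session token followed by that session's item tokens. It is an illustration under stated assumptions, not the authors' implementation: the function names `build_multi_session_sequence` and `session_targets`, the `(session_id, item_id)` event layout, and the choice to treat each session token's target as the unordered set of items that follow it are all hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Set, Tuple

# Hypothetical token layout: ("session", session_id) or ("item", item_id).
Token = Tuple[str, int]


def build_multi_session_sequence(events: List[Tuple[int, int]]) -> List[Token]:
    """Flatten chronologically ordered (session_id, item_id) events into
    one sequence where each session token precedes its item tokens."""
    tokens: List[Token] = []
    # groupby merges consecutive events that share a session_id, so the
    # input is assumed to be sorted by time with contiguous sessions.
    for session_id, group in groupby(events, key=lambda e: e[0]):
        tokens.append(("session", session_id))
        tokens.extend(("item", item_id) for _, item_id in group)
    return tokens


def session_targets(tokens: List[Token]) -> Dict[int, Set[int]]:
    """Map the position of each session token to the unordered set of
    item ids in that session (an assumed form of the NSP target)."""
    targets: Dict[int, Set[int]] = {}
    current_pos = None
    for pos, (kind, value) in enumerate(tokens):
        if kind == "session":
            current_pos = pos
            targets[pos] = set()
        elif current_pos is not None:
            targets[current_pos].add(value)
    return targets


# Toy log: two request sessions for one user.
events = [(1, 10), (1, 11), (2, 20), (2, 10), (2, 21)]
tokens = build_multi_session_sequence(events)
targets = session_targets(tokens)
```

Because the targets are sets, the order of items within a session does not matter, which reflects the abstract's point about generating multiple items without positional constraints within a single request session.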
