Breaking the Likelihood Trap: Consistent Generative Recommendation with Graph-structured Model
Qiya Yang, Xiaoxi Liang, Zeping Xiao, Yingjie Deng, Yalong Wang, Yongqi Liu, Han Li
TL;DR
This work tackles the likelihood trap in generative reranking for recommender systems by introducing Congrats, a graph-structured, non-autoregressive framework. It expands the decoding space with a directed acyclic graph whose vertices are position embeddings, and it learns a vertex-transition matrix from which diverse, coherent sequences can be sampled, yielding $P_g$ over an enlarged space. A differentiable cascade with a three-tower PLE evaluator aligns generator training with actual user preferences through the objective $L_{total}=L_{con}+\alpha\,L_{gen}$, using the Gumbel-Softmax trick to obtain a differentiable approximation $\hat{P}_g$. Extensive offline and online experiments on Kuaishou and Avito demonstrate significant gains in both accuracy (Recall/AUC/NDCG) and diversity (item coverage, distinct-2) while preserving real-time inference, which supports practical viability for large-scale industrial deployment. The approach offers a principled way to balance efficiency, quality, and diversity in modern recommender systems.
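For readers unfamiliar with the Gumbel-Softmax trick mentioned above, its standard form is given below. This is the generic relaxation of a categorical sample, not the paper's specific construction of $\hat{P}_g$; the symbols $\pi$, $K$, $\tau$, $u$, $g$, and $y$ are generic notation assumed here for illustration and do not come from this work.

```latex
% Generic Gumbel-Softmax relaxation (notation assumed for illustration).
% \pi_1, \dots, \pi_K : categorical probabilities,  \tau > 0 : temperature.
\begin{align}
  u_k &\sim \mathrm{Uniform}(0, 1), \qquad g_k = -\log\bigl(-\log u_k\bigr), \\
  y_k &= \frac{\exp\bigl((\log \pi_k + g_k)/\tau\bigr)}
              {\sum_{j=1}^{K} \exp\bigl((\log \pi_j + g_j)/\tau\bigr)},
  \qquad k = 1, \dots, K .
\end{align}
% As \tau \to 0 the vector y approaches a one-hot sample from \pi, while for
% \tau > 0 it remains differentiable with respect to \pi.
```

Because $y$ is a smooth function of $\pi$, a downstream loss computed on $y$ can pass gradients back to whatever produces $\pi$, which is the property a differentiable generator-evaluator cascade relies on.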
Abstract
Reranking, as the final stage of recommender systems, demands real-time inference, accuracy, and diversity. It determines the final exposure and therefore directly influences user experience. Recently, generative reranking has gained increasing attention for its strong ability to model complex dependencies among items. However, most existing methods suffer from the "likelihood trap", where high-likelihood sequences are often perceived as low-quality by humans. These models tend to repeatedly recommend the same set of high-frequency items, which makes lists homogeneous and limits user engagement. In this work, we propose Consistent Graph-structured Generative Recommendation (Congrats), a novel generative reranking framework. To break the likelihood trap, we introduce a graph-structured decoder that can capture diverse sequences along multiple paths. This design not only expands the decoding space to promote diversity, but also improves prediction accuracy through the implicit item dependencies derived from vertex transitions. Furthermore, we design a differentiable cascade system that incorporates an evaluator, enabling the model to use user preferences directly as its training objective. Extensive offline experiments validate the superior performance of Congrats over state-of-the-art reranking methods. Moreover, Congrats has been evaluated on Kuaishou, a large-scale video-sharing app with over 300 million daily active users, where it significantly improves both recommendation quality and diversity, confirming its effectiveness in practical industrial environments.
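The idea of capturing diverse sequences along multiple paths through a directed acyclic graph can be illustrated with a small sketch. This is not the authors' decoder: it assumes a hypothetical graph with more vertices than list positions, a score matrix `logits` standing in for a learned vertex-transition matrix, and a made-up helper `sample_path` that walks only forward edges (from vertex `i` to some `j > i`), which keeps the graph acyclic.

```python
import math
import random


def sample_path(logits, length, rng):
    """Sample `length` strictly increasing vertex indices, starting at vertex 0.

    logits[i][j] is a hypothetical unnormalized transition score from vertex i
    to vertex j. Only entries with j > i are read, so no cycle can occur.
    """
    num_vertices = len(logits)
    if not 1 <= length <= num_vertices:
        raise ValueError("length must be between 1 and the number of vertices")
    path = [0]
    while len(path) < length:
        cur = path[-1]
        remaining = length - len(path)  # vertices still to pick, including this one
        # Largest next vertex that still leaves room for the rest of the path.
        last_allowed = num_vertices - remaining
        candidates = list(range(cur + 1, last_allowed + 1))
        scores = [logits[cur][j] for j in candidates]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        path.append(rng.choices(candidates, weights=weights, k=1)[0])
    return path


# Toy usage: 8 vertices, lists of 4 positions, random scores as a stand-in
# for learned transition parameters.
rng = random.Random(0)
logits = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(8)]
paths = {tuple(sample_path(logits, 4, rng)) for _ in range(200)}
print(len(paths), "distinct paths sampled")
```

In this toy setup, each distinct path selects a different subset of vertices, so repeated sampling explores several candidate sequences rather than a single highest-likelihood one. How Congrats maps vertices to items and scores the resulting lists is described in the paper itself and is not reproduced here.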
