Breaking the Likelihood Trap: Consistent Generative Recommendation with Graph-structured Model

Qiya Yang, Xiaoxi Liang, Zeping Xiao, Yingjie Deng, Yalong Wang, Yongqi Liu, Han Li

TL;DR

This work tackles the likelihood trap in generative reranking for recommender systems by introducing Congrats, a graph-structured, non-autoregressive framework. It expands the decoding space with a directed acyclic graph whose vertices are position embeddings, and it learns a vertex-transition matrix from which diverse, coherent sequences are sampled, yielding a distribution $P_g$ over the enlarged space. A differentiable cascade with a Three-tower PLE evaluator aligns generator training with actual user preferences through the objective $L_{total}=L_{con}+\alpha\,L_{gen}$, using a differentiable approximation $\hat{P}_g$ obtained with the Gumbel-Softmax trick. Offline and online experiments on Kuaishou and Avito show significant gains in both accuracy (Recall/AUC/NDCG) and diversity (item coverage, distinct-2) while keeping inference real-time, which supports practical viability for large-scale industrial deployment. The approach offers a principled way to balance efficiency, quality, and diversity in modern recommender systems.
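To make the graph-structured decoding idea concrete, the following is a minimal sketch of sampling a path through a directed acyclic graph from a vertex-transition matrix. It is an illustration under stated assumptions, not the authors' implementation: the function names, the random scores, the choice to start at the first vertex and stop at the last, and the variable path length are all hypothetical. The only property taken from the summary is that transitions go between position vertices in a DAG, which the sketch enforces by allowing only forward edges.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(scores):
    """Normalise raw scores over forward edges only (j > i).

    Restricting each row to later vertices keeps the graph acyclic.
    `scores` is a hypothetical n x n list of raw transition scores.
    """
    n = len(scores)
    rows = []
    for i in range(n - 1):
        probs = softmax(scores[i][i + 1:])
        rows.append([0.0] * (i + 1) + probs)
    rows.append([0.0] * n)  # the last vertex has no outgoing edges
    return rows


def sample_path(trans, rng):
    """Walk from vertex 0 to the last vertex, sampling one forward edge per step.

    Start and end vertices are assumptions made for this sketch.
    """
    n = len(trans)
    path = [0]
    while path[-1] != n - 1:
        i = path[-1]
        candidates = list(range(i + 1, n))
        weights = trans[i][i + 1:]
        path.append(rng.choices(candidates, weights=weights, k=1)[0])
    return path


if __name__ == "__main__":
    rng = random.Random(0)
    n_vertices = 6  # toy size; Figure 2 reports 24 position vertices in practice
    scores = [[rng.gauss(0.0, 1.0) for _ in range(n_vertices)]
              for _ in range(n_vertices)]
    trans = transition_matrix(scores)
    for _ in range(3):
        print(sample_path(trans, rng))
```

Because each call to `sample_path` can follow a different route through the graph, repeated sampling yields different vertex sequences from the same matrix, which is the sense in which a graph enlarges the decoding space relative to a single fixed chain of positions.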

Abstract

Reranking, as the final stage of recommender systems, demands real-time inference, accuracy, and diversity. It plays a crucial role in determining the final exposure, directly influencing user experience. Recently, generative reranking has gained increasing attention for its strong ability to model complex dependencies among items. However, most existing methods suffer from the "likelihood trap", where high-likelihood sequences are often perceived as low-quality by humans. These models tend to repeatedly recommend a set of high-frequency items, which makes lists homogeneous and limits user engagement. In this work, we propose Consistent Graph-structured Generative Recommendation (Congrats), a novel generative reranking framework. To break the likelihood trap, we introduce a novel graph-structured decoder that can capture diverse sequences along multiple paths. This design not only expands the decoding space to promote diversity, but also improves prediction accuracy through the implicit item dependencies derived from vertex transitions. Furthermore, we design a differentiable cascade system that incorporates an evaluator, so that the model can use user preferences directly as its training objective. Extensive offline experiments validate the superior performance of Congrats over state-of-the-art reranking methods. Moreover, Congrats has been evaluated on a large-scale video-sharing app, Kuaishou, with over 300 million daily active users. The results show that our approach significantly improves both recommendation quality and diversity, which validates its effectiveness in practical industrial environments.
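The differentiable cascade depends on passing gradients through a discrete choice of items, and the TL;DR names the Gumbel-Softmax trick for this. The sketch below shows the standard Gumbel-Softmax relaxation in plain Python, together with the weighted sum $L_{total}=L_{con}+\alpha\,L_{gen}$. It is a generic illustration rather than the paper's code: the function names, the temperature `tau`, the example logits, and the placeholder loss values are assumptions, and the exact definitions of $L_{con}$ and $L_{gen}$ are not given in this summary.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Return a relaxed one-hot sample: softmax((logits + Gumbel noise) / tau).

    Smaller `tau` pushes the output closer to a hard one-hot vector.
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def total_loss(l_con, l_gen, alpha):
    """Combine the two loss terms as L_total = L_con + alpha * L_gen."""
    return l_con + alpha * l_gen


if __name__ == "__main__":
    rng = random.Random(0)
    item_logits = [2.0, 0.5, -1.0]  # hypothetical scores for three candidate items
    soft_choice = gumbel_softmax(item_logits, tau=0.5, rng=rng)
    print(soft_choice)  # a probability vector instead of a hard index
    print(total_loss(l_con=1.0, l_gen=2.0, alpha=0.5))
```

In an automatic-differentiation framework the same computation would be built from tensor operations, so that an evaluator's score on the soft selection can propagate gradients back to the generator's logits.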

Paper Structure

This paper contains 14 sections, 18 equations, 3 figures, 8 tables, 1 algorithm.

Figures (3)

  • Figure 1: An intuitive understanding of the likelihood trap. Previous generative reranking methods (left) tend to repeatedly recommend high-frequency ("hot") items, leading to homogeneous recommendations. In contrast, our proposed Congrats effectively generates diverse sequences (right), thereby achieving more personalized recommendations.
  • Figure 2: Overview of our proposed Congrats, including the graph-structured model and the consistent differentiable training. In practice, we set the number of position vertices to 24. Note that both the vertex transition and sequence generation processes run in parallel, which allows deployment on a real-time industrial platform.
  • Figure 3: Effect of graph size factor ($\lambda$) on Recall@6. Increasing $\lambda$ improves recall up to $\lambda = 6$, after which the performance slightly drops. The best balance between accuracy and computation cost is achieved at $\lambda = 4$.