Next Interest Flow: A Generative Pre-training Paradigm for Recommender Systems by Modeling All-domain Movelines

Chen Gao, Zixin Zhao, Lv Shao, Tong Liu

TL;DR

This work addresses CTR prediction by reframing user-intent modeling as a proactive generation task. It introduces the Next Interest Flow, a dense vector representation of a user's future interest, learned through the two-stage AMEN framework. Stage 1 is generative pre-training: a Transformer-based decoder $G_{\phi}$ predicts $\mathbf{F}$ and is trained with an InfoNCE loss plus diversity and velocity regularizers. Stage 2 is discriminative fine-tuning: the frozen $G_{\phi}$ supplies forward-looking features to the discriminative model $F_{\theta}$, which is enhanced by a Semantic Alignment Module and a Temporal Sequential Pairwise (TSP) auxiliary task. The objective mismatch between the two stages is addressed through cross-stage weight initialization and semantic alignment, while TSP strengthens the discriminative model's handling of temporal causality. In offline experiments AMEN outperforms strong baselines (e.g., AUC $0.7708$, +$0.87$ over MEDN, +$1.43$ over P5, +$0.70$ over TIGER), and an online A/B test on Taobao reports a +11.6% lift in post-view CTCVR. Overall, the results suggest that proactive, all-domain intent modeling with a unified generative-discriminative pipeline can deliver substantial gains in real-world recommender systems.
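
The TL;DR names InfoNCE as the Stage 1 training signal but does not spell out its form. As a rough illustration only, the sketch below computes a standard InfoNCE loss over predicted and target vectors. The in-batch negatives, cosine similarity, the `temperature` value, and the function name `info_nce_loss` are all assumptions made for this example; the paper's actual negative sampling and its diversity and velocity regularizers are not described in this summary and are not reproduced here.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce_loss(predicted, targets, temperature=0.1):
    """Mean InfoNCE loss with in-batch negatives (illustrative assumption).

    predicted[i] is a predicted interest vector and targets[i] is its
    positive target embedding; every other row of targets acts as a
    negative for row i. All vectors are assumed to be non-zero.
    """
    p = [_normalize(v) for v in predicted]
    t = [_normalize(v) for v in targets]
    total = 0.0
    for i, p_i in enumerate(p):
        # Cosine similarity of prediction i to every target, scaled by temperature.
        logits = [sum(a * b for a, b in zip(p_i, t_j)) / temperature for t_j in t]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denom - logits[i]
    return total / len(p)
```

The loss is small when each prediction is closest to its own target and grows when a prediction sits nearer to another row's target, which is the contrastive behaviour the pre-training stage relies on.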

Abstract

Click-Through Rate (CTR) prediction, a cornerstone of modern recommender systems, has been dominated by discriminative models that react to past user behavior rather than proactively modeling user intent. Existing generative paradigms attempt to address this but suffer from critical limitations: Large Language Model (LLM) based methods create a semantic mismatch by forcing e-commerce signals into a linguistic space, while ID-based generation is constrained by item memorization and cold-start issues. To overcome these limitations, we propose a novel generative pre-training paradigm. Our model learns to predict the Next Interest Flow, a dense vector sequence representing a user's future intent, while simultaneously modeling its internal Interest Diversity and Interest Evolution Velocity to ensure the representation is both rich and coherent. However, this two-stage approach introduces a critical objective mismatch between the generative and discriminative stages. We resolve this via a bidirectional alignment strategy, which harmonizes the two stages through cross-stage weight initialization and a dynamic Semantic Alignment Module for fine-tuning. Additionally, we enhance the underlying discriminative model with a Temporal Sequential Pairwise (TSP) mechanism to better capture temporal causality. We present the All-domain Moveline Evolution Network (AMEN), a unified framework implementing our entire pipeline. Extensive offline experiments validate AMEN's superiority over strong baselines, and a large-scale online A/B test demonstrates its significant real-world impact, delivering substantial improvements in key business metrics.

Paper Structure

This paper contains 10 sections, 10 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: The overall architecture of the All-domain Moveline Evolution Network (AMEN). (a) Stage 1: Generative Pre-training. (b) Stage 2: Discriminative Fine-tuning. (c) Discriminator Enhancement: Temporal Sequential Pairwise (TSP) Task.
  • Figure 2: Visualization of the information decoded from the Next Interest Flow.
  • Figure 3: Probability density distributions of the TSP calibration score ($c_{\text{tsp}}$).