RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems

Junhua Liu, Yang Jihao, Cheng Chang, Kunrong LI, Bin Fu, Kwan Hui Lim

TL;DR

This work tackles proactive intent prediction in zero-query e-commerce chatbots by addressing a semantic gap between discrete user signals and textual KB intents, and an objective misalignment between general LLM outputs and ranking goals. It introduces RGAlign-Rec, a closed-loop framework that couples an LLM-based semantic reasoner with a QE-Rec three-tower ranking model and a Ranking-Guided Alignment (RGA) mechanism. The approach uses Last Token Pooling to obtain a high-fidelity semantic embedding, trains a stable reward model, and then applies multi-LLM distillation and contrastive, representation-level alignment (RG-CL) to harmonize the semantic and ranking spaces, followed by closed-loop calibration. Online and offline experiments on Shopee data show consistent improvements in GAUC, Recall, and CTR, confirming that ranking-aware alignment of semantic reasoning yields practical gains for production-scale proactive recommendation systems.
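As a rough illustration of Last Token Pooling, the sketch below takes the hidden state of the final non-padding token of each sequence as that sequence's embedding. It is a minimal, framework-free sketch and not the authors' implementation: the function name `last_token_pool`, the nested-list layout of `hidden_states`, and the 0/1 `attention_mask` convention are all assumptions made for the example.

```python
from typing import List, Sequence


def last_token_pool(
    hidden_states: Sequence[Sequence[Sequence[float]]],
    attention_mask: Sequence[Sequence[int]],
) -> List[List[float]]:
    """Pick the hidden state of the last real (non-padding) token per sequence.

    hidden_states: [batch][seq_len][dim] final-layer states (assumed layout).
    attention_mask: [batch][seq_len], 1 for real tokens and 0 for padding.
    Works for left- or right-padded batches because it searches the mask.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        # Indices of real tokens; the last one marks the end of the sequence.
        real_positions = [i for i, m in enumerate(mask) if m == 1]
        if not real_positions:
            raise ValueError("sequence has no non-padding tokens")
        pooled.append(list(states[real_positions[-1]]))
    return pooled


# Toy batch: the first sequence is right-padded, the second is full length.
hidden = [
    [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]],
    [[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]],
]
mask = [
    [1, 1, 0],
    [1, 1, 1],
]
embeddings = last_token_pool(hidden, mask)
```

In a decoder-only LLM the last token has attended to the whole prompt, which is the usual motivation for pooling there instead of averaging over all positions.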

Abstract

Proactive intent prediction is a critical capability in modern e-commerce chatbots, enabling "zero-query" recommendations by anticipating user needs from behavioral and contextual signals. However, existing industrial systems face two fundamental challenges: (1) the semantic gap between discrete user features and the semantic intents within the chatbot's Knowledge Base, and (2) the objective misalignment between general-purpose LLM outputs and task-specific ranking utilities. To address these issues, we propose RGAlign-Rec, a closed-loop alignment framework that integrates an LLM-based semantic reasoner with a Query-Enhanced (QE) ranking model. We also introduce Ranking-Guided Alignment (RGA), a multi-stage training paradigm that utilizes downstream ranking signals as feedback to refine the LLM's latent reasoning. Extensive experiments on a large-scale industrial dataset from Shopee demonstrate that RGAlign-Rec achieves a 0.12% gain in GAUC, leading to a significant 3.52% relative reduction in error rate, and a 0.56% improvement in Recall@3. Online A/B testing further validates the cumulative effectiveness of our framework: the Query-Enhanced model (QE-Rec) initially yields a 0.98% improvement in CTR, while the subsequent Ranking-Guided Alignment stage contributes an additional 0.13% gain. These results indicate that ranking-aware alignment effectively synchronizes semantic reasoning with ranking objectives, significantly enhancing both prediction accuracy and service quality in real-world proactive recommendation systems.
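For readers unfamiliar with GAUC, the sketch below shows one common way to compute it: AUC is evaluated separately for each user and the per-user values are averaged, weighted by the number of impressions. The abstract does not spell out the exact variant used, so the impression-count weighting, the skipping of users with only one label class, and the names `auc` and `gauc` are assumptions of this example.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined without both classes
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def gauc(user_ids, labels, scores):
    """Impression-weighted average of per-user AUC (assumed weighting)."""
    groups = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        groups[u][0].append(y)
        groups[u][1].append(s)

    weighted_sum = 0.0
    total_weight = 0.0
    for ys, ss in groups.values():
        user_auc = auc(ys, ss)
        if user_auc is None:
            continue  # skip users whose impressions are all clicks or all non-clicks
        weighted_sum += len(ys) * user_auc
        total_weight += len(ys)
    if total_weight == 0:
        raise ValueError("no user has both positive and negative labels")
    return weighted_sum / total_weight


# Toy data: u1 is ranked perfectly, u2 half right, u3 has no clicks and is skipped.
users = ["u1", "u1", "u2", "u2", "u2", "u3", "u3"]
clicks = [1, 0, 1, 0, 0, 0, 0]
preds = [0.9, 0.1, 0.2, 0.5, 0.1, 0.3, 0.4]
score = gauc(users, clicks, preds)
```

Because ranking quality is measured within each user's own candidate list, GAUC is not inflated by score differences between users, which is why it is often preferred over global AUC for recommendation ranking.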

Paper Structure (20 sections, 13 equations, 3 figures, 5 tables)

Figures (3)

  • Figure 1: Challenges of Semantic Gap and Objective Misalignment
  • Figure 2: RGAlign-Rec Framework and Online Inference Solution
  • Figure 3: RGAlign-Rec Training Pipeline