RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems
Junhua Liu, Yang Jihao, Cheng Chang, Kunrong LI, Bin Fu, Kwan Hui Lim
TL;DR
This work tackles proactive intent prediction in zero-query e-commerce chatbots. It addresses two problems: a semantic gap between discrete user signals and textual Knowledge Base (KB) intents, and an objective misalignment between general-purpose LLM outputs and ranking goals. It introduces RGAlign-Rec, a closed-loop framework that couples an LLM-based semantic reasoner with QE-Rec, a three-tower ranking model, through a Ranking-Guided Alignment (RGA) mechanism. The approach uses Last Token Pooling to obtain a high-fidelity semantic embedding and trains a stable reward model. It then applies multi-LLM distillation and contrastive, representation-level alignment (RG-CL) to harmonize the semantic and ranking spaces, followed by closed-loop calibration. Offline and online experiments on Shopee data show consistent improvements in GAUC, Recall, and CTR, indicating that ranking-aware alignment of semantic reasoning yields practical gains for production-scale proactive recommendation systems.
Abstract
Proactive intent prediction is a critical capability in modern e-commerce chatbots, enabling "zero-query" recommendations by anticipating user needs from behavioral and contextual signals. However, existing industrial systems face two fundamental challenges: (1) the semantic gap between discrete user features and the semantic intents within the chatbot's Knowledge Base, and (2) the objective misalignment between general-purpose LLM outputs and task-specific ranking utilities. To address these issues, we propose RGAlign-Rec, a closed-loop alignment framework that integrates an LLM-based semantic reasoner with a Query-Enhanced (QE) ranking model. We also introduce Ranking-Guided Alignment (RGA), a multi-stage training paradigm that uses downstream ranking signals as feedback to refine the LLM's latent reasoning. Extensive experiments on a large-scale industrial dataset from Shopee demonstrate that RGAlign-Rec achieves a 0.12% gain in GAUC, corresponding to a 3.52% relative reduction in error rate, and a 0.56% improvement in Recall@3. Online A/B testing further validates the cumulative effectiveness of our framework: the Query-Enhanced model (QE-Rec) initially yields a 0.98% improvement in CTR, while the subsequent Ranking-Guided Alignment stage contributes an additional 0.13% gain. These results indicate that ranking-aware alignment effectively synchronizes semantic reasoning with ranking objectives, improving both prediction accuracy and service quality in real-world proactive recommendation systems.
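
For readers unfamiliar with the headline offline metric, the following is a minimal sketch of how GAUC (group AUC) is commonly computed: AUC is evaluated separately within each user's impressions and then averaged with per-user weights. The abstract does not specify the weighting scheme or the grouping key, so impression-count weighting, grouping by user, and the function names `auc` and `gauc` are all assumptions made for illustration, not the authors' evaluation code.

```python
from collections import defaultdict


def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. Returns None when the group has only one class,
    because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def gauc(user_ids, labels, scores):
    """Impression-weighted average of per-user AUC (weighting is an assumption)."""
    groups = defaultdict(lambda: ([], []))
    for u, y, s in zip(user_ids, labels, scores):
        groups[u][0].append(y)
        groups[u][1].append(s)

    weighted_sum, total_weight = 0.0, 0
    for ys, ss in groups.values():
        group_auc = auc(ys, ss)
        if group_auc is None:
            continue  # skip users with only clicks or only non-clicks
        weighted_sum += len(ys) * group_auc
        total_weight += len(ys)
    return weighted_sum / total_weight if total_weight else None


# Toy data (made up for illustration only).
users = ["u1", "u1", "u1", "u2", "u2", "u3", "u3"]
clicks = [1, 0, 0, 1, 0, 0, 0]
preds = [0.9, 0.2, 0.4, 0.3, 0.6, 0.5, 0.1]
print(gauc(users, clicks, preds))
```

In this toy example, user `u1` is ranked perfectly, user `u2` is ranked incorrectly, and user `u3` is skipped because they have no positive label. Because the metric is computed within each user, it reflects per-user ranking quality rather than global score calibration, which is why it is often preferred over plain AUC for recommendation ranking.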
