Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation
Jiakai Tang, Sunhao Dai, Teng Shi, Jun Xu, Xu Chen, Wen Chen, Jian Wu, Yuning Jiang
TL;DR
This work introduces ReaRec, an inference-time reasoning framework for sequential recommendation that augments traditional forward-only models with multi-step implicit reasoning. By feeding the encoder's last hidden state back into the model for additional reasoning steps, marked with reasoning position embeddings, and training with Ensemble Reasoning Learning (ERL) and Progressive Reasoning Learning (PRL), the approach yields robust gains across diverse backbones and datasets, notably boosting performance for long-tail users and items while keeping latency modest. Extensive experiments demonstrate consistent improvements (often 7–12% on average) and reveal that the optimal reasoning depth is typically modest (around two steps), with post-hoc analyses indicating substantial potential for lifting the performance ceiling. The study highlights a promising new direction toward depth-aware, inference-time computation in recommender systems and suggests avenues for adaptive, efficient reasoning in deployment scenarios.
Abstract
Sequential Recommendation (SeqRec) aims to predict the next item by capturing sequential patterns from users' historical interactions, playing a crucial role in many real-world recommender systems. However, existing approaches predominantly adopt a direct forward computation paradigm, where the final hidden state of the sequence encoder serves as the user representation. We argue that this inference paradigm, due to its limited computational depth, struggles to model the complex, evolving nature of user preferences and lacks a nuanced understanding of long-tail items, leading to suboptimal performance. To address this issue, we propose ReaRec, the first inference-time computing framework for recommender systems, which enhances user representations through implicit multi-step reasoning. Specifically, ReaRec autoregressively feeds the sequence's last hidden state into the sequential recommender while incorporating special reasoning position embeddings to decouple the original item encoding space from the multi-step reasoning space. Moreover, we introduce two lightweight reasoning-based learning methods, Ensemble Reasoning Learning (ERL) and Progressive Reasoning Learning (PRL), to exploit ReaRec's reasoning potential more effectively. Extensive experiments on five public real-world datasets and different SeqRec architectures demonstrate the generality and effectiveness of our proposed ReaRec. Remarkably, post-hoc analyses reveal that ReaRec significantly elevates the performance ceiling of multiple sequential recommendation backbones by approximately 30%–50%. Thus, we believe this work can open a new and promising avenue for future research in inference-time computing for sequential recommendation.
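The autoregressive reasoning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_encoder` is a hypothetical stand-in for a real sequence encoder, `multi_step_reasoning` is an illustrative name, the item vectors are assumed to already carry their ordinary position information, and the choice to add the reasoning position embedding to the fed-back hidden state is an assumption about how the two are combined.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def toy_encoder(seq: List[Vector]) -> Vector:
    """Hypothetical stand-in for a sequence encoder; returns a last hidden state.

    It mixes the mean of all inputs with the last input and applies tanh.
    A real backbone (for example, a Transformer) would replace this function.
    """
    dim = len(seq[0])
    mean = [sum(v[d] for v in seq) / len(seq) for d in range(dim)]
    return [math.tanh(0.5 * mean[d] + 0.5 * seq[-1][d]) for d in range(dim)]


def multi_step_reasoning(
    item_embs: List[Vector],
    reasoning_pos_embs: List[Vector],
    encoder: Callable[[List[Vector]], Vector] = toy_encoder,
) -> List[Vector]:
    """Return the hidden state after step 0 (plain forward pass) and each reasoning step."""
    seq = list(item_embs)       # copy so the caller's list is not modified
    hidden = encoder(seq)       # step 0: the usual direct forward computation
    states = [hidden]
    for pos in reasoning_pos_embs:
        # Feed the last hidden state back as a new input. Adding a dedicated
        # reasoning position embedding (an assumed combination rule) marks it
        # as a reasoning step rather than an item.
        seq.append([h + p for h, p in zip(hidden, pos)])
        hidden = encoder(seq)
        states.append(hidden)
    return states


# Usage: five random "item" vectors and two reasoning steps.
random.seed(0)
dim = 4
items = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
reasoning_pos = [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(2)]
states = multi_step_reasoning(items, reasoning_pos)
user_repr = states[-1]          # here, simply the final reasoning step's state
```

Taking the final step's state as the user representation is one simple choice made for this sketch; how the per-step states are combined and supervised is what the ERL and PRL learning methods address, and this sketch does not attempt to reproduce them.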
