Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Jiakai Tang, Sunhao Dai, Teng Shi, Jun Xu, Xu Chen, Wen Chen, Jian Wu, Yuning Jiang

TL;DR

This work introduces ReaRec, an inference-time reasoning framework for sequential recommendation that augments traditional forward-only models with multi-step implicit reasoning. It marks reasoning steps with dedicated reasoning position embeddings and trains them with two lightweight strategies, Ensemble Reasoning Learning (ERL) and Progressive Reasoning Learning (PRL). The approach yields robust gains across diverse backbones and datasets, notably for long-tail users and items, while keeping latency modest. Extensive experiments show consistent improvements (often 7–12% on average) and indicate that the optimal reasoning depth is typically small (around two steps). Post-hoc analyses suggest substantial remaining potential for lifting the performance ceiling. The study points to depth-aware, inference-time computation as a promising direction for recommender systems and suggests avenues for adaptive, efficient reasoning in deployment.

Abstract

Sequential Recommendation (SeqRec) aims to predict the next item by capturing sequential patterns from users' historical interactions, playing a crucial role in many real-world recommender systems. However, existing approaches predominantly adopt a direct forward computation paradigm, where the final hidden state of the sequence encoder serves as the user representation. We argue that this inference paradigm, due to its limited computational depth, struggles to model the complex, evolving nature of user preferences and lacks a nuanced understanding of long-tail items, leading to suboptimal performance. To address this issue, we propose ReaRec, the first inference-time computing framework for recommender systems, which enhances user representations through implicit multi-step reasoning. Specifically, ReaRec autoregressively feeds the sequence's last hidden state into the sequential recommender while incorporating special reasoning position embeddings to decouple the original item encoding space from the multi-step reasoning space. Moreover, we introduce two lightweight reasoning-based learning methods, Ensemble Reasoning Learning (ERL) and Progressive Reasoning Learning (PRL), to exploit ReaRec's reasoning potential more effectively. Extensive experiments on five public real-world datasets and different SeqRec architectures demonstrate the generality and effectiveness of the proposed ReaRec. Remarkably, post-hoc analyses reveal that ReaRec significantly elevates the performance ceiling of multiple sequential recommendation backbones by approximately 30%-50%. Thus, we believe this work can open a new and promising avenue for future research in inference-time computing for sequential recommendation.
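
The autoregressive reasoning loop described in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: `toy_encoder`, `multi_step_reasoning`, and the random embeddings are hypothetical stand-ins, and a real backbone would be a trained sequence encoder. The sketch only shows the data flow: the last hidden state is appended to the input with a reasoning position embedding instead of an item position embedding, and the encoder is run again for each reasoning step.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


def toy_encoder(inputs: List[Vector]) -> Vector:
    """Hypothetical stand-in for a sequence encoder.

    Uses the last input as an attention query over all inputs and
    returns a tanh-squashed weighted mean as the last hidden state.
    """
    query = inputs[-1]
    scores = [sum(q * k for q, k in zip(query, x)) for x in inputs]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # stable softmax numerators
    total = sum(weights)
    dim = len(query)
    return [
        math.tanh(sum(w * x[d] for w, x in zip(weights, inputs)) / total)
        for d in range(dim)
    ]


def multi_step_reasoning(
    item_embs: List[Vector],
    item_pos_embs: List[Vector],
    reasoning_pos_embs: List[Vector],
    encoder: Callable[[List[Vector]], Vector],
    num_steps: int,
) -> List[Vector]:
    """Return the hidden states of step 0 (no reasoning) through step num_steps."""
    # Item positions use the ordinary position embeddings.
    inputs = [add(e, p) for e, p in zip(item_embs, item_pos_embs)]
    hidden = encoder(inputs)
    states = [hidden]
    for k in range(num_steps):
        # Feed the last hidden state back in, tagged with a reasoning
        # position embedding rather than an item position embedding.
        inputs.append(add(hidden, reasoning_pos_embs[k]))
        hidden = encoder(inputs)
        states.append(hidden)
    return states


random.seed(0)
DIM, SEQ_LEN, NUM_STEPS = 4, 5, 2  # two steps, assumed for illustration


def rand_vec() -> Vector:
    return [random.uniform(-1.0, 1.0) for _ in range(DIM)]


items = [rand_vec() for _ in range(SEQ_LEN)]
item_pos = [rand_vec() for _ in range(SEQ_LEN)]
reasoning_pos = [rand_vec() for _ in range(NUM_STEPS)]

states = multi_step_reasoning(items, item_pos, reasoning_pos, toy_encoder, NUM_STEPS)
user_repr = states[-1]  # one possible choice: use the final reasoning state
```

How the per-step states are combined for training and scoring is what ERL and PRL address; the sketch does not model either strategy and simply takes the final state as the user representation.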

Paper Structure

This paper contains 40 sections, 14 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Illustration of traditional direct inference (i.e., reasoning-free) and our proposed multi-step reasoning-enhanced sequential recommendation framework.
  • Figure 2: Empirical performance gains and potential upper bound analysis of optimal reasoning steps ($\mathbf{K=2}$) on the Yelp dataset across different SeqRec models.
  • Figure 3: Overview of the proposed ReaRec framework and two reasoning-enhanced learning strategies: Ensemble Reasoning Learning and Progressive Reasoning Learning.
  • Figure 4: Robustness study w.r.t. different user and item subgroups on the Yelp dataset. 'Step-$x$' represents the recommendation performance at the $x$-th reasoning step. 'UG' and 'IG' denote User Group and Item Group, respectively, where higher group numbers indicate longer sequences and more popular items.
  • Figure 5: The performance variation trend of different methods under different reasoning steps.
  • ...and 6 more figures