SPRINT: Scalable and Predictive Intent Refinement for LLM-Enhanced Session-based Recommendation
Gyuseok Lee, Wonbin Kweon, Zhenrui Yue, Yaokun Liu, Yifan Liu, Susik Yoon, Dong Wang, SeongKu Kang
TL;DR
This work tackles two obstacles to applying LLM-based profiling in session-based recommendation (SBR): context scarcity and poor scalability. It introduces SPRINT, a two-stage framework. The first stage derives reliable session intents by invoking the LLM only for uncertain sessions and constraining its output with a global intent pool. The second stage uses a lightweight intent predictor and collaborative enrichment to integrate multi-intent signals into SBR, so no LLM is needed at inference. The approach improves recommendation accuracy while reducing training and inference costs, outperforming state-of-the-art baselines across three real-world datasets. The results also show improved explainability and practical viability for real-time SBR deployments, highlighting the value of intent generation constrained by a global intent pool and of cross-session collaboration. Overall, SPRINT offers a scalable, interpretable way to bring LLM knowledge into personalized, efficient session-based recommendation.
Abstract
Large language models (LLMs) have enhanced conventional recommendation models via user profiling, which generates representative textual profiles from users' historical interactions. However, their direct application to session-based recommendation (SBR) remains challenging due to severe session context scarcity and poor scalability. In this paper, we propose SPRINT, a scalable SBR framework that incorporates reliable and informative intents while ensuring high efficiency in both training and inference. SPRINT constrains LLM-based profiling with a global intent pool and validates inferred intents based on recommendation performance to mitigate noise and hallucinations under limited context. To ensure scalability, LLMs are selectively invoked only for uncertain sessions during training, while a lightweight intent predictor generalizes intent prediction to all sessions without LLM dependency at inference time. Experiments on real-world datasets show that SPRINT consistently outperforms state-of-the-art methods while providing more explainable recommendations.
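The two mechanisms named in the abstract, selective LLM invocation for uncertain sessions and constraining inferred intents to a global intent pool, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state how uncertainty is measured, so predictive entropy over a base recommender's next-item distribution is used as a stand-in, and the function names, session ids, and intent strings are hypothetical rather than taken from SPRINT.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_uncertain_sessions(session_probs, budget):
    """Return the ids of the `budget` sessions with the highest entropy.

    Assumption: entropy of the base recommender's next-item distribution
    stands in for whatever uncertainty measure the paper actually uses.
    """
    ranked = sorted(
        session_probs,
        key=lambda sid: entropy(session_probs[sid]),
        reverse=True,
    )
    return ranked[:budget]


def constrain_to_pool(candidate_intents, intent_pool):
    """Keep only intents found in the global pool, preserving order
    and dropping duplicates."""
    seen = set()
    kept = []
    for intent in candidate_intents:
        if intent in intent_pool and intent not in seen:
            seen.add(intent)
            kept.append(intent)
    return kept


# Hypothetical next-item distributions for three sessions.
session_probs = {
    "s1": [0.97, 0.01, 0.01, 0.01],  # confident prediction
    "s2": [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    "s3": [0.60, 0.30, 0.05, 0.05],  # in between
}

# Hypothetical global intent pool and raw LLM output for one session.
intent_pool = {"budget travel", "gift shopping", "home office setup"}
raw_llm_intents = ["gift shopping", "luxury yachts", "gift shopping"]

uncertain = select_uncertain_sessions(session_probs, budget=1)
filtered = constrain_to_pool(raw_llm_intents, intent_pool)
```

In this sketch only the sessions returned by `select_uncertain_sessions` would be sent to the LLM during training, and any intent the LLM proposes outside the pool is discarded. The performance-based validation of intents and the lightweight intent predictor described in the abstract are not modeled here.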
