Uncovering Selective State Space Model's Capabilities in Lifelong Sequential Recommendation
Jiyuan Yang, Yuanzi Li, Jingyu Zhao, Hanbing Wang, Muyang Ma, Jun Ma, Zhaochun Ren, Mengqi Zhang, Xin Xin, Zhumin Chen, Pengjie Ren
TL;DR
The paper tackles lifelong sequential recommendation by introducing RecMamba, a framework that replaces Transformer-style attention with a Mamba block (a selective state space model) to model long user sequences efficiently. The approach achieves performance comparable or superior to strong baselines while reducing training time by approximately 70% and memory cost by approximately 80%, with the efficiency gains most pronounced on very long sequences (2k–5k interactions). Extensive experiments on KuaiRand and LFM-1b show that longer sequences yield better recommendations and that RecMamba is efficient enough to make lifelong sequence modeling practical at scale. The work highlights the potential of selective state space models to balance modeling power and computational cost in real-world recommender systems.
Abstract
Sequential recommenders have been widely applied in various online services, aiming to model users' dynamic interests from their sequential interactions. As users increasingly engage with online platforms, vast amounts of lifelong user behavioral sequences have been generated. However, existing sequential recommender models often struggle to handle such lifelong sequences. The primary challenges are computational complexity and the difficulty of capturing long-range dependencies within the sequence. Recently, a state space model featuring a selective mechanism (i.e., Mamba) has emerged. In this work, we investigate the performance of Mamba for lifelong sequential recommendation (i.e., sequence length >= 2k). More specifically, we leverage the Mamba block to model lifelong user sequences selectively, and we refer to the resulting model as RecMamba. We conduct extensive experiments to evaluate the performance of representative sequential recommendation models in the setting of lifelong sequences. Experiments on two real-world datasets demonstrate the superiority of Mamba. We find that RecMamba achieves performance comparable to the representative model while reducing training duration by approximately 70% and memory costs by 80%. Code and data are available at https://github.com/nancheng58/RecMamba.
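To make the "selective mechanism" concrete, the following is a minimal sketch of a selective state space recurrence for a single input channel, written in pure Python. It is not the RecMamba implementation: the function name `selective_scan`, the scalar weights `w_delta`, `w_b`, `w_c`, and the choice to make the step size and projections simple linear functions of the input are all assumptions made for illustration. What it is meant to show is that the step size, input projection, and output projection depend on the current input, and that the sequence is processed in one pass whose cost grows linearly with its length.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, a, w_delta, w_b, w_c):
    """Illustrative selective SSM over one channel.

    xs      : list of floats, the input sequence
    a       : list of N negative floats (diagonal state matrix A)
    w_delta : float, hypothetical weight producing the step size
    w_b,w_c : lists of N floats, hypothetical weights producing B_t and C_t
    """
    n = len(a)
    h = [0.0] * n  # hidden state, fixed size regardless of sequence length
    ys = []
    for x in xs:  # single pass: cost is linear in len(xs)
        delta = softplus(w_delta * x)   # input-dependent step size
        b = [w * x for w in w_b]        # input-dependent B_t
        c = [w * x for w in w_c]        # input-dependent C_t
        for i in range(n):
            a_bar = math.exp(delta * a[i])  # discretized state transition
            b_bar = delta * b[i]            # simplified discretized input matrix
            h[i] = a_bar * h[i] + b_bar * x
        ys.append(sum(c[i] * h[i] for i in range(n)))
    return ys


if __name__ == "__main__":
    outputs = selective_scan([1.0, 0.5, -0.2, 0.0], a=[-1.0, -0.5],
                             w_delta=0.3, w_b=[1.0, 0.5], w_c=[0.2, 1.0])
    print(outputs)
```

Because the hidden state has a fixed size, memory for the recurrence itself does not grow with sequence length, in contrast to the pairwise attention matrix of a Transformer, whose size grows quadratically with the number of interactions.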
