Transformers Meet ACT-R: Repeat-Aware and Sequential Listening Session Recommendation
Viet-Anh Tran, Guillaume Salha-Galvan, Bruno Sguerra, Romain Hennequin
TL;DR
This work addresses repeat-aware sequential listening session recommendation by integrating ACT-R cognitive memory components with Transformer-based session embeddings, modeling both repetition and evolving user preferences. The proposed PISA framework uses base-level, spreading, and partial matching components to construct session representations, which are merged with long- and short-term user embeddings and optimized with a joint BPR-style and session-level loss. Empirical validation on Last.fm and Deezer shows that PISA, especially its PISA-P variant with popularity-based negative sampling, achieves strong recall and NDCG while balancing repetition and exploration. The authors also release their code and a Deezer dataset to promote reproducibility and further research. The work demonstrates the practical importance of modeling repetition in music recommendation and lays a foundation for future psychology-informed, dynamic recommender systems in sequential settings.
Abstract
Music streaming services often leverage sequential recommender systems to predict the best music to showcase to users based on past sequences of listening sessions. Nonetheless, most sequential recommendation methods ignore or insufficiently account for repetitive behaviors. This is a crucial limitation for music recommendation, as repeatedly listening to the same song over time is a common phenomenon that can even change the way users perceive this song. In this paper, we introduce PISA (Psychology-Informed Session embedding using ACT-R), a session-level sequential recommender system that overcomes this limitation. PISA employs a Transformer architecture learning embedding representations of listening sessions and users using attention mechanisms inspired by Anderson's ACT-R (Adaptive Control of Thought-Rational), a cognitive architecture modeling human information access and memory dynamics. This approach enables us to capture dynamic and repetitive patterns from user behaviors, allowing us to effectively predict the songs they will listen to in subsequent sessions, whether they are repeated or new ones. We demonstrate the empirical relevance of PISA using both publicly available listening data from Last.fm and proprietary data from Deezer, a global music streaming service, confirming the critical importance of repetition modeling for sequential listening session recommendation. Along with this paper, we publicly release our proprietary dataset to foster future research in this field, as well as the source code of PISA to facilitate its future use.
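To make the memory-dynamics idea concrete, the sketch below computes the textbook ACT-R base-level activation, B = ln(Σ_j (now − t_j)^(−d)), for a song given its past listening times. This is an illustration under stated assumptions, not PISA's actual attention formulation, which is not specified in this summary: the function name, the timestamps, and the decay value d = 0.5 (the conventional ACT-R default) are all hypothetical choices made for the example.

```python
import math

def base_level_activation(listen_times, now, decay=0.5):
    """Textbook ACT-R base-level activation: ln(sum_j (now - t_j) ** -decay).

    Illustrative only; not the authors' implementation.
    """
    # Keep only listens that happened strictly before the current time.
    ages = [now - t for t in listen_times if t < now]
    if not ages:
        return float("-inf")  # never listened before: no memory trace
    return math.log(sum(age ** -decay for age in ages))

# Hypothetical listening histories (timestamps in hours).
frequent_recent = base_level_activation([90.0, 95.0, 99.0], now=100.0)
single_old = base_level_activation([10.0], now=100.0)
```

Under this formula, a song played often and recently receives a higher activation than one played once long ago, which is the kind of repetition signal the abstract describes.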
