Contrastive Learning for Sequential Recommendation
Xu Xie, Fei Sun, Zhaoyang Liu, Shiwen Wu, Jinyang Gao, Bolin Ding, Bin Cui
TL;DR
CL4SRec addresses data sparsity and evolving user interests in sequential recommendation by combining a sequence-level contrastive learning objective with the standard next-item prediction task. The framework uses a Transformer-based encoder to encode two augmented views of each user sequence, produced by three augmentation operators (crop, mask, reorder), and optimizes a multi-task loss that combines the supervised and contrastive signals. Empirical results on four public datasets show state-of-the-art performance in both sparse and dense regimes, with ablations demonstrating the effectiveness of the self-supervised component and of the augmentation strategies. The work also shows that CL4SRec learns more coherent user representations, which supports its effect on practical recommendation quality.
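As a rough illustration of the three augmentation operators named above, the following minimal sketch applies crop, mask, and reorder to a list of item ids. The function names, the default ratios, the `MASK_TOKEN` id, and the `two_views` helper are assumptions made for this example, not values or code taken from the paper.

```python
import random

MASK_TOKEN = 0  # hypothetical item id reserved for masked positions


def crop(seq, ratio=0.6, rng=random):
    """Keep a random contiguous sub-sequence covering `ratio` of a non-empty sequence."""
    length = max(1, int(len(seq) * ratio))
    start = rng.randint(0, len(seq) - length)
    return seq[start:start + length]


def mask(seq, ratio=0.3, rng=random):
    """Replace a random `ratio` of the items with MASK_TOKEN, keeping the length."""
    n_masked = int(len(seq) * ratio)
    masked_positions = set(rng.sample(range(len(seq)), n_masked))
    return [MASK_TOKEN if i in masked_positions else item
            for i, item in enumerate(seq)]


def reorder(seq, ratio=0.5, rng=random):
    """Shuffle a random contiguous window covering `ratio` of the items."""
    length = int(len(seq) * ratio)
    start = rng.randint(0, len(seq) - length)
    window = seq[start:start + length]  # slicing copies, so `seq` is untouched
    rng.shuffle(window)
    return seq[:start] + window + seq[start + length:]


def two_views(seq, rng=random):
    """Draw two operators at random and apply each to the same sequence."""
    ops = [crop, mask, reorder]
    return rng.choice(ops)(seq, rng=rng), rng.choice(ops)(seq, rng=rng)
```

Calling `two_views([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])` would return two differently perturbed copies of the same interaction history, which a contrastive objective can then treat as a positive pair.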
Abstract
Sequential recommendation methods play a crucial role in modern recommender systems because of their ability to capture a user's dynamic interest from her/his historical interactions. Despite their success, we argue that these approaches usually rely on the sequential prediction task alone to optimize their large number of parameters. They usually suffer from the data sparsity problem, which makes it difficult for them to learn high-quality user representations. To tackle this, inspired by recent advances in contrastive learning techniques in computer vision, we propose a novel multi-task model called Contrastive Learning for Sequential Recommendation (CL4SRec). CL4SRec not only takes advantage of the traditional next-item prediction task but also utilizes the contrastive learning framework to derive self-supervision signals from the original user behavior sequences. Therefore, it can extract more meaningful user patterns and encode the user representation more effectively. In addition, we propose three data augmentation approaches to construct self-supervision signals. Extensive experiments on four public datasets demonstrate that CL4SRec achieves state-of-the-art performance over existing baselines by inferring better user representations.
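To make the multi-task idea concrete, here is a minimal sketch of a contrastive term over pairs of sequence representations, added to a next-item prediction loss. It assumes an NT-Xent-style formulation with cosine similarity, a temperature, and a scalar weight; these choices, along with the names `contrastive_loss` and `multi_task_loss`, are illustrative assumptions rather than details stated in the abstract.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(views_a, views_b, temperature=0.5):
    """NT-Xent-style loss; views_a[i] and views_b[i] encode the same user sequence."""
    reps = views_a + views_b
    n = len(views_a)
    total = 0.0
    for i, anchor in enumerate(reps):
        positive = (i + n) % (2 * n)  # index of the other view of the same user
        # Every other representation in the batch (positive included) is a candidate.
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(reps) if j != i]
        pos_logit = cosine(anchor, reps[positive]) / temperature
        total += math.log(sum(math.exp(x) for x in logits)) - pos_logit
    return total / (2 * n)


def multi_task_loss(next_item_loss, cl_loss, weight=0.1):
    """Supervised loss plus a weighted contrastive term (weight is an assumption)."""
    return next_item_loss + weight * cl_loss
```

In this sketch, representations of two views of the same user are pulled together while the other sequences in the batch act as negatives, so well-aligned view pairs yield a lower contrastive term than mismatched ones.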
