AsarRec: Adaptive Sequential Augmentation for Robust Self-supervised Sequential Recommendation
Kaike Zhang, Qi Cao, Fei Sun, Xinran Liu
TL;DR
AsarRec addresses the fragility of self-supervised sequential recommendation under noisy user behavior by learning adaptive augmentation strategies. It unifies augmentation operations into a constrained matrix framework and uses a differentiable Semi-Sinkhorn process to produce per-user transformation matrices, optimized for diversity, semantic invariance, and informativeness. Experiments on three datasets and multiple backbones show state-of-the-art robustness and consistent gains over static augmentation baselines, including under synthetic noise. The approach generalizes across backbones and offers insight into how augmentations should adapt to data characteristics and noise levels in real-world applications.
Abstract
Sequential recommender systems have demonstrated strong capabilities in modeling users' dynamic preferences and capturing item transition patterns. However, real-world user behaviors are often noisy due to factors such as human error, uncertainty, and behavioral ambiguity, which can degrade recommendation performance. To address this issue, recent approaches widely adopt self-supervised learning (SSL), particularly contrastive learning, generating perturbed views of user interaction sequences and maximizing their mutual information to improve model robustness. However, these methods rely heavily on pre-defined static augmentation strategies (where the augmentation type remains fixed once chosen) to construct augmented views, which leads to two critical challenges: (1) the optimal augmentation type can vary significantly across scenarios; (2) inappropriate augmentations may even degrade recommendation performance, limiting the effectiveness of SSL. To overcome these limitations, we propose an adaptive augmentation framework. We first express existing basic augmentation operations in a unified formulation via structured transformation matrices. Building on this, we introduce AsarRec (Adaptive Sequential Augmentation for Robust Sequential Recommendation), which learns to generate transformation matrices by encoding user sequences into probabilistic transition matrices and projecting them into hard semi-doubly stochastic matrices via a differentiable Semi-Sinkhorn algorithm. To ensure that the learned augmentations benefit downstream performance, we jointly optimize three objectives: diversity, semantic invariance, and informativeness. Extensive experiments on three benchmark datasets under varying noise levels validate the effectiveness of AsarRec, demonstrating superior robustness and consistent improvements.
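As an illustration of what a unified matrix view of augmentations could look like, the sketch below writes three common static operations (mask, crop, reorder) as 0/1 transformation matrices applied to an item-ID sequence. The conventions here are assumptions made for illustration and are not taken from the paper: the helper names (`apply_transform`, `mask_matrix`, `crop_matrix`, `reorder_matrix`), the use of `0` as a padding token, and the reading of the constraint as "every row and every column sums to at most 1" are all hypothetical.

```python
# Illustrative sketch only: matrix conventions and helper names are
# assumptions, not the paper's exact formulation.

def apply_transform(T, seq, pad=0):
    """Apply a 0/1 matrix T (n x n) to an item-ID sequence of length n.

    Row i of T marks which original position fills output slot i;
    an all-zero row produces the padding token.
    """
    out = []
    for row in T:
        picked = [seq[j] for j, t in enumerate(row) if t == 1]
        out.append(picked[0] if picked else pad)
    return out

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mask_matrix(n, positions):
    """Identity with the diagonal zeroed at the masked positions."""
    T = identity(n)
    for p in positions:
        T[p][p] = 0
    return T

def crop_matrix(n, start, length):
    """Output slot i takes original position start + i; the rest is padding."""
    T = [[0] * n for _ in range(n)]
    for i in range(length):
        T[i][start + i] = 1
    return T

def reorder_matrix(n, perm):
    """Permutation matrix: output slot i takes original position perm[i]."""
    T = [[0] * n for _ in range(n)]
    for i, j in enumerate(perm):
        T[i][j] = 1
    return T

def rows_and_cols_at_most_one(T):
    """Check the assumed constraint: each row and column sums to <= 1."""
    rows_ok = all(sum(row) <= 1 for row in T)
    cols_ok = all(sum(col) <= 1 for col in zip(*T))
    return rows_ok and cols_ok

seq = [11, 22, 33, 44, 55]
masked = apply_transform(mask_matrix(5, [1, 3]), seq)
cropped = apply_transform(crop_matrix(5, 1, 3), seq)
reordered = apply_transform(reorder_matrix(5, [0, 2, 1, 3, 4]), seq)
```

Under these assumptions, all three operations fall into one family of sparse 0/1 matrices, which suggests why a model that outputs a constrained matrix per user could, in principle, choose among or blend such operations instead of fixing one in advance.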
