Simple and Behavior-Driven Augmentation for Recommendation with Rich Collaborative Signals
Doyun Choi, Cheonwoo Lee, Jaemin Yoo
TL;DR
This work addresses contrastive learning for graph-based collaborative filtering under sparse, implicit feedback, where denoising-based augmentations risk removing core informative signals. It introduces Simple Collaborative Augmentation for Recommendation (SCAR), which uses two behavior-driven augmentations, ColAdd and ColRep, to insert or replace pseudo-interactions informed by collaborative signals rather than removing edges. The approach is analyzed in terms of effectiveness, complexity, and multi-hop signal capture. Experiments show gains over CL baselines and MAE-based methods, particularly on sparse datasets, along with stable behavior across hyperparameter settings and interpretable augmentations. The method offers a scalable, transparent augmentation paradigm that improves representation learning and downstream recommendation performance.
Abstract
Contrastive learning (CL) has been widely used for enhancing the performance of graph collaborative filtering (GCF) for personalized recommendation. Since data augmentation plays a crucial role in the success of CL, previous works have designed augmentation methods to remove noisy interactions between users and items in order to generate effective augmented views. However, the ambiguity in defining "noisiness" presents a persistent risk of losing core information and generating unreliable data views, while increasing the overall complexity of augmentation. In this paper, we propose Simple Collaborative Augmentation for Recommendation (SCAR), a novel and intuitive augmentation method designed to maximize the effectiveness of CL for GCF. Instead of removing information, SCAR leverages collaborative signals extracted from user-item interactions to generate pseudo-interactions, which are then either added to or used to replace existing interactions. This results in more robust representations while avoiding the pitfalls of overly complex augmentation modules. We conduct experiments on four benchmark datasets and show that SCAR outperforms previous CL-based GCF methods as well as other state-of-the-art self-supervised learning approaches across key evaluation metrics. SCAR exhibits strong robustness across different hyperparameter settings and is particularly effective in sparse data scenarios.
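To make the add-versus-replace idea concrete, the following is a minimal illustrative sketch, not the SCAR procedure itself. The abstract does not specify how collaborative signals are computed or how pseudo-interactions are selected, so everything here is an assumption: the item co-occurrence score, the top-k selection, the random choice of which interactions to replace, and the function names `collaborative_scores`, `add_pseudo_interactions`, and `replace_interactions` are all hypothetical.

```python
import numpy as np

def collaborative_scores(R):
    """Score every (user, item) pair from item co-occurrence (an assumed signal)."""
    co = R.T @ R                # item-item co-occurrence counts
    np.fill_diagonal(co, 0)     # ignore an item's co-occurrence with itself
    return (R @ co).astype(float)

def add_pseudo_interactions(R, k=1):
    """Addition-style view: add up to k top-scoring unseen items per user."""
    scores = collaborative_scores(R)
    scores[R > 0] = -np.inf     # only items the user has not interacted with
    view = R.copy()
    for u in range(R.shape[0]):
        top = np.argsort(-scores[u], kind="stable")[:k]
        top = top[scores[u, top] > 0]   # skip items with no collaborative support
        view[u, top] = 1
    return view

def replace_interactions(R, k=1, seed=0):
    """Replacement-style view: swap existing interactions for pseudo-interactions."""
    rng = np.random.default_rng(seed)
    added = add_pseudo_interactions(R, k)
    view = added.copy()
    for u in range(R.shape[0]):
        n_new = int((added[u] - R[u]).sum())
        existing = np.flatnonzero(R[u])
        # Assumption: always keep at least one original interaction per user.
        n_drop = min(n_new, len(existing) - 1)
        if n_drop > 0:
            drop = rng.choice(existing, size=n_drop, replace=False)
            view[u, drop] = 0
    return view

# Toy binary interaction matrix: 3 users x 4 items.
R = np.array([[1, 1, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 1]])
view_add = add_pseudo_interactions(R, k=1)
view_rep = replace_interactions(R, k=1)
```

In this sketch the addition-style view is a superset of the original interactions, while the replacement-style view keeps each user's interaction count unchanged. The two views could then serve as the augmented graphs fed to a contrastive objective.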
