Simple and Behavior-Driven Augmentation for Recommendation with Rich Collaborative Signals

Doyun Choi, Cheonwoo Lee, Jaemin Yoo

TL;DR

This work addresses contrastive learning for graph-based collaborative filtering under sparse, implicit feedback, where denoising-based augmentations risk removing core informative signals. It introduces Simple Collaborative Augmentation for Recommendation (SCAR), which uses two behavior-driven augmentations, ColAdd and ColRep, that add pseudo-interactions informed by collaborative signals or use them to replace existing interactions, rather than only removing edges. The approach is analyzed in terms of effectiveness, complexity, and multi-hop signal capture. Experiments on four benchmark datasets show that it outperforms CL baselines and MAE-based methods, particularly on sparse data, while remaining stable across hyperparameter settings and interpretable. The method offers a scalable, transparent augmentation paradigm that improves representation learning and downstream recommendation performance.

Abstract

Contrastive learning (CL) has been widely used for enhancing the performance of graph collaborative filtering (GCF) for personalized recommendation. Since data augmentation plays a crucial role in the success of CL, previous works have designed augmentation methods to remove noisy interactions between users and items in order to generate effective augmented views. However, the ambiguity in defining "noisiness" presents a persistent risk of losing core information and generating unreliable data views, while increasing the overall complexity of augmentation. In this paper, we propose Simple Collaborative Augmentation for Recommendation (SCAR), a novel and intuitive augmentation method designed to maximize the effectiveness of CL for GCF. Instead of removing information, SCAR leverages collaborative signals extracted from user-item interactions to generate pseudo-interactions, which are then either added to or used to replace existing interactions. This results in more robust representations while avoiding the pitfalls of overly complex augmentation modules. We conduct experiments on four benchmark datasets and show that SCAR outperforms previous CL-based GCF methods as well as other state-of-the-art self-supervised learning approaches across key evaluation metrics. SCAR exhibits strong robustness across different hyperparameter settings and is particularly effective in sparse data scenarios.

Paper Structure

This paper contains 20 sections, 3 theorems, 16 equations, 7 figures, 5 tables.

Key Result

Proposition 1

An item $i$ is an effective candidate for a user $u$ if $i$ is similar to the items that $u$ has already interacted with.
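
One way to read Proposition 1 in code is sketched below. This is a minimal illustration, not the paper's scoring function: the cosine similarity between item interaction columns, the summation over a user's items, and the name `item_based_scores` are all assumptions made for this example.

```python
import numpy as np

def item_based_scores(R: np.ndarray) -> np.ndarray:
    """Score each (user, item) pair by the item's similarity to the user's items.

    R is a binary user-item interaction matrix (users x items).
    Assumption: similarity is cosine similarity between item columns, and a
    candidate's score is the sum of its similarities to the user's items.
    """
    R = R.astype(float)
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0            # avoid division by zero for unseen items
    Rn = R / norms                     # column-normalized interactions
    S = Rn.T @ Rn                      # item-item cosine similarity
    np.fill_diagonal(S, 0.0)           # an item should not vote for itself
    scores = R @ S                     # aggregate similarity to interacted items
    scores[R > 0] = -np.inf            # existing interactions are not candidates
    return scores

# Toy data: users 0 and 1 share items 0 and 1; user 1 also has item 2.
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 0, 1]])
scores = item_based_scores(R)
best_for_user0 = int(np.argmax(scores[0]))
```

In the toy matrix, item 2 co-occurs with both of user 0's items while item 3 co-occurs with neither, so item 2 is the stronger candidate for user 0 under this assumed score.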

Figures (7)

  • Figure 1: Performance comparison on the Yelp and LastFM datasets using the original graph for training and the augmented graphs for inference. "Denoising" original edges degrades performance, even with a learnable denoiser (DOI: 10.1145/3580305.3599768), highlighting its risk. In contrast, augmented graphs from our methods, i.e., OurAug1 and OurAug2, remain stable.
  • Figure 2: The overall framework of SCAR. At each training epoch, ColAdd and ColRep generate augmented views by inserting pseudo-interactions into the data, and these views are used for contrastive learning. The original graph's representations are used for the main recommendation task.
  • Figure 3: Visualization of the two SCAR augmentation methods: ColAdd (left) and ColRep (right). For ColAdd, the top-$k$ items with the highest normalized effectiveness scores are sampled for each randomly selected user and then added through new weighted edges. On the other hand, ColRep replaces the edges of selected users connected to the least effective items with new edges to the items that have the most similar behavioral features to the replaced items.
  • Figure 4: Illustration of how our augmentation methods reflect multi-hop collaborative signals and preserve core node features. In ColAdd (center), original interactions remain untouched, while pseudo-interactions with 3-hop items are added, allowing up to 5-hop collaborative signals to be incorporated into the representation. In ColRep (right), the least effective edge is replaced with an edge connected to a 3-hop item, partially preserving the 2-hop signals lost due to edge removal, while retaining signals from the removed edge. This process also incorporates multi-hop collaborative signals, similar to ColAdd.
  • Figure 5: Performance across user groups with different interaction sparsity levels on Gowalla and Yelp. SCAR is generally effective and is particularly strong for sparse users.
  • ...and 2 more figures
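
The two augmentations described in the Figure 3 caption can be sketched roughly as follows. This is a hedged illustration only: the effectiveness scores and item similarities are taken as given inputs, and the function names (`col_add`, `col_rep`, `_pick_users`), the `user_ratio` parameter, the max-based weight normalization, and the choice of one replaced edge per user are assumptions made for this example, not details confirmed by the paper.

```python
import numpy as np

def _pick_users(n_users, user_ratio, rng):
    """Randomly select a fraction of users to augment (hypothetical helper)."""
    n_pick = max(1, int(round(user_ratio * n_users)))
    return rng.choice(n_users, size=n_pick, replace=False)

def col_add(R, scores, user_ratio=0.5, k=2, rng=None):
    """ColAdd-style view: keep all original edges, add top-k weighted pseudo-edges.

    scores[u, i] is an assumed non-negative effectiveness score of item i for user u.
    """
    rng = np.random.default_rng() if rng is None else rng
    A = R.astype(float).copy()
    for u in _pick_users(R.shape[0], user_ratio, rng):
        cand = np.flatnonzero(R[u] == 0)           # items the user has not seen
        if cand.size == 0:
            continue
        top = cand[np.argsort(-scores[u, cand])[:k]]
        peak = scores[u, top].max()
        if peak <= 0:                              # nothing worth adding
            continue
        A[u, top] = scores[u, top] / peak          # assumed normalization to (0, 1]
    return A

def col_rep(R, edge_scores, item_sim, user_ratio=0.5, rng=None):
    """ColRep-style view: swap each selected user's least effective edge.

    edge_scores[u, i] scores existing edges; item_sim[i, j] is item-item similarity.
    """
    rng = np.random.default_rng() if rng is None else rng
    A = R.astype(float).copy()
    for u in _pick_users(R.shape[0], user_ratio, rng):
        owned = np.flatnonzero(R[u] > 0)
        cand = np.flatnonzero(R[u] == 0)
        if owned.size == 0 or cand.size == 0:
            continue
        worst = owned[np.argmin(edge_scores[u, owned])]   # least effective item
        new = cand[np.argmax(item_sim[worst, cand])]      # most similar unseen item
        A[u, worst] = 0.0
        A[u, new] = 1.0
    return A
```

Both functions return a dense augmented matrix for readability. A practical version would likely operate on sparse edge lists and regenerate the views at each training epoch, as the Figure 2 caption describes.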

Theorems & Definitions (3)

  • Proposition 1: item-based pseudo-interactions
  • Proposition 2: user-based pseudo-interactions
  • Proposition 3: unrelated items