Generative Data Augmentation in Graph Contrastive Learning for Recommendation

Yansong Wang, Qihui Lin, Junjie Huang, Tao Jia

TL;DR

The paper tackles interaction sparsity in recommendation by combining graph contrastive learning with generative data augmentation. It introduces GDA4Rec, which uses a deep generative noise module to produce adaptive augmented views and derives an item complement matrix from user-item interactions to provide additional self-supervised signals. The augmentation objective is $\mathcal{L}_{aug}=\mathcal{L}_{recon}+\mathcal{L}_{ddl}$. The framework integrates a LightGCN backbone with multi-view generation and multi-pair contrast, and is optimized with the joint objective $\mathcal{L}=\mathcal{L}_{rec}+\lambda\mathcal{L}_{cl}+\mathcal{L}_{aug}+\mathcal{L}_{reg}$ to learn informative embeddings. Experiments on three public datasets show consistent gains over strong baselines, particularly in sparse settings, and ablation studies confirm the contributions of the generative augmentation and the item complement matrix. The code is released for reproducibility.
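
To make the structure of the joint objective concrete, the following is a minimal sketch of how the individual terms combine. The function name `joint_loss`, its argument names, and the numeric values are hypothetical and chosen only for illustration; the sketch assumes each term has already been computed as a scalar and does not reflect how the released code computes them.

```python
def joint_loss(l_rec, l_cl, l_recon, l_ddl, l_reg, lam):
    """Combine scalar loss terms as L = L_rec + lam * L_cl + L_aug + L_reg.

    All argument names are illustrative placeholders, not names from the paper's code.
    """
    # Augmentation objective: L_aug = L_recon + L_ddl
    l_aug = l_recon + l_ddl
    # Only the contrastive term carries the weight lambda in the stated objective.
    return l_rec + lam * l_cl + l_aug + l_reg


# Made-up values, used only to show the arithmetic.
total = joint_loss(l_rec=0.5, l_cl=2.0, l_recon=0.25, l_ddl=0.25, l_reg=0.125, lam=0.5)
print(total)
```

With these made-up values, the contrastive term contributes $0.5 \times 2.0 = 1.0$, so $\lambda$ directly controls how strongly the self-supervised signal competes with the recommendation loss.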

Abstract

Recommendation systems have become indispensable in various online platforms, from e-commerce to streaming services. A fundamental challenge in this domain is learning effective embeddings from sparse user-item interactions. While contrastive learning has recently emerged as a promising solution to this issue, most existing methods generate augmented views through random data augmentation, which often alters the original semantic information. In this paper, we propose a novel framework, GDA4Rec (Generative Data Augmentation in graph contrastive learning for Recommendation), to generate high-quality augmented views and provide robust self-supervised signals. Specifically, we employ a noise generation module that leverages deep generative models to approximate the distribution of the original data for data augmentation. Additionally, GDA4Rec extracts an item complement matrix to characterize the latent correlations between items and provide additional self-supervised signals. Lastly, a joint objective that integrates recommendation, data augmentation and contrastive learning encourages the model to learn more effective and informative embeddings. Extensive experiments on three public datasets demonstrate the superiority of the model. The code is available at: https://github.com/MrYansong/GDA4Rec.

Paper Structure

This paper contains 29 sections, 22 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: (a) Interaction records between user1 and user2. (b) Embeddings of users and items in the latent space, where dashed and solid lines denote their positions in the original and augmented views, respectively.
  • Figure 2: The framework of the proposed model, which includes five parts: input, multi-view generation, data augmentation and encoding, multi-pair contrast, and output.
  • Figure 3: The process of generating the adjacency matrix $A \in \mathbb{R}^{(m+n) \times (m+n)}$ and the complement matrix $C \in \mathbb{R}^{n \times n}$ from the user-item graph $G_r$. $R \in \mathbb{R}^{m \times n}$ denotes the interaction matrix.
  • Figure 4: Performance of different variants on NDCG@20 and Recall@20.
  • Figure 5: The learning trajectories of alignment and uniformity on CiaoDVD and Yelp2. The color of the points ranges from light to dark, indicating the progression from the early to the later stages of training.
  • ...and 2 more figures
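
The matrix shapes named in the Figure 3 caption can be illustrated with a small sketch. It assumes the adjacency matrix takes the standard bipartite block form $A = \begin{pmatrix} 0 & R \\ R^\top & 0 \end{pmatrix}$ used with LightGCN-style encoders. The captions do not say how the complement matrix $C$ is computed, so `cooccurrence_complement` below is a hypothetical stand-in (item co-occurrence counts with a zeroed diagonal) that only matches the stated $n \times n$ shape and is not the paper's definition. The function names and the toy matrix `R` are likewise illustrative.

```python
import numpy as np


def build_adjacency(R):
    """Bipartite adjacency A = [[0, R], [R^T, 0]] with shape (m+n, m+n)."""
    m, n = R.shape
    A = np.zeros((m + n, m + n), dtype=R.dtype)
    A[:m, m:] = R      # user -> item edges
    A[m:, :m] = R.T    # item -> user edges
    return A


def cooccurrence_complement(R):
    """Hypothetical stand-in for C: item-item co-occurrence counts.

    NOT the paper's definition; it only reproduces the n x n shape.
    """
    C = R.T @ R                # C[i, j] = number of users who interacted with both i and j
    np.fill_diagonal(C, 0)     # drop self-co-occurrence
    return C


# Toy interaction matrix: m = 2 users, n = 3 items.
R = np.array([[1, 1, 0],
              [0, 1, 1]])
A = build_adjacency(R)
C = cooccurrence_complement(R)
print(A.shape, C.shape)
```

In this toy example, item 1 is shared by both users, so under the co-occurrence assumption it is linked to items 0 and 2, while items 0 and 2 have no user in common and stay unlinked.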