Mixed Supervised Graph Contrastive Learning for Recommendation

Weizhi Zhang, Liangwei Yang, Zihe Song, Henry Peng Zou, Ke Xu, Yuanjie Zhu, Philip S. Yu

TL;DR

MixSGCL tackles two core problems in graph-based recommendation: inconsistent optimization caused by decoupled supervised and self-supervised losses, and data sparsity that unsupervised augmentation does not alleviate. By unifying training into a single supervised graph contrastive loss $\mathcal{L}_{sgcl}$ and introducing node-level ($N_{mix}$) and edge-level ($EMix$) mixups, the method aligns gradient directions and injects direct supervised signals from existing user-item interactions. Empirical results on three real-world datasets show that MixSGCL achieves top ranking performance with faster convergence and lower training time than strong baselines, validating the efficacy of coupled supervision in graph contrastive learning for RecSys. The approach offers a practical path toward more accurate and efficient recommendations in sparse settings with limited supervision signals.

Abstract

Recommender systems (RecSys) play a vital role in online platforms, offering users personalized suggestions amidst vast information. Graph contrastive learning aims to learn from high-order collaborative filtering signals with unsupervised augmentation on the user-item bipartite graph, and it predominantly relies on a multi-task learning framework involving both the pair-wise recommendation loss and the contrastive loss. This decoupled design can cause inconsistent optimization directions across the different losses, which leads to longer convergence time and even sub-optimal performance. Moreover, the self-supervised contrastive loss falls short in alleviating the data sparsity issue in RecSys, as it learns to differentiate users/items from different views without providing extra supervised collaborative filtering signals during augmentation. In this paper, we propose Mixed Supervised Graph Contrastive Learning for Recommendation (MixSGCL) to address these concerns. MixSGCL integrates the training of the recommendation and unsupervised contrastive losses into a single supervised contrastive learning loss, aligning the two tasks within one optimization direction. To cope with the data sparsity issue, instead of unsupervised augmentation, we further propose node-wise and edge-wise mixup to mine more direct supervised collaborative filtering signals from existing user-item interactions. Extensive experiments on three real-world datasets demonstrate that MixSGCL surpasses state-of-the-art methods, achieving top performance on both accuracy and efficiency. These results validate the effectiveness of the coupled design of MixSGCL for supervised graph contrastive learning.
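
As a rough illustration of the two ideas in the abstract, the sketch below computes a supervised contrastive loss over observed user-item pairs and adds mixup-generated pairs to it. It is not the authors' implementation. The abstract gives no formulas, so the in-batch InfoNCE form, the temperature, the mixing coefficient `lam`, the way embeddings are interpolated, and the function names `sgcl_loss`, `node_mixup`, `edge_mixup`, and `mixed_sgcl_loss` are all assumptions. Plain NumPy arrays stand in for the output of a graph encoder.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norm, eps)


def sgcl_loss(user_emb, item_emb, temperature=0.2):
    """Assumed in-batch InfoNCE: row i of user_emb and row i of item_emb
    form an observed interaction (the positive pair); all other items in
    the batch act as negatives for user i."""
    u = l2_normalize(user_emb)
    v = l2_normalize(item_emb)
    logits = (u @ v.T) / temperature                      # shape (B, B)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def node_mixup(user_emb, item_emb, lam, perm):
    """Hypothetical node-level mixup: interpolate each user (item) with
    another user (item) from the same batch, chosen by the index array perm."""
    mixed_u = lam * user_emb + (1.0 - lam) * user_emb[perm]
    mixed_v = lam * item_emb + (1.0 - lam) * item_emb[perm]
    return mixed_u, mixed_v


def edge_mixup(user_emb, item_emb, lam):
    """Hypothetical edge-level mixup: interpolate the two endpoints of each
    observed user-item edge with one another."""
    mixed_u = lam * user_emb + (1.0 - lam) * item_emb
    mixed_v = lam * item_emb + (1.0 - lam) * user_emb
    return mixed_u, mixed_v


def mixed_sgcl_loss(user_emb, item_emb, lam=0.8, seed=0, temperature=0.2):
    """Sum the loss over the original pairs and both kinds of mixed pairs
    (equal weighting is an assumption)."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(user_emb.shape[0])
    u_hat, v_hat = node_mixup(user_emb, item_emb, lam, perm)
    u_bar, v_bar = edge_mixup(user_emb, item_emb, lam)
    return (sgcl_loss(user_emb, item_emb, temperature)
            + sgcl_loss(u_hat, v_hat, temperature)
            + sgcl_loss(u_bar, v_bar, temperature))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    users = rng.normal(size=(4, 8))                  # toy user embeddings
    items = users + 0.1 * rng.normal(size=(4, 8))    # items close to their users
    print(mixed_sgcl_loss(users, items))
```

Every term in the sum is built from observed interactions, so a single objective carries the supervised signal and there is no separate recommendation loss that could pull the gradients in another direction. That is the coupling the abstract describes, shown here only in toy form.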

Paper Structure (32 sections, 9 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: The overall architecture of MixSGCL, including the SGCL loss along with node-level and edge-level mixup for supervised augmentation. The pair $(u_2,v_4)$ is augmented once by each type of mixup, generating two extra pairs, $(\hat{u}_2, \hat{v}_4)$ and $(\bar{u}_2, \bar{v}_4)$, as supervision signals for SGCL loss optimization. The rightmost panel shows their relations in the embedding space.
  • Figure 2: Ablation study on different components.
  • Figure 3: Trade-off between the performance and the efficiency on the Beauty dataset. The upper side indicates better performance; the left side represents more efficient training.
  • Figure 4: Recall and NDCG curves over the first 30 epochs.
  • Figure 5: Performance when trained on different portions of the original training data (i.e., with sparser supervision).
  • ...and 3 more figures