Your Graph Recommender is Provably a Single-view Graph Contrastive Learning

Wenjie Yang, Shengzhong Zhang, Jiaxing Guo, Zengfeng Huang

TL;DR

The paper investigates the relationship between Graph Recommender (GR) systems and Graph Contrastive Learning (GCL), proving that graph recommender encoders are equivalent to a single-view graph contrastive learning model and that the GR loss $\mathcal{L}_{\mathrm{BPR}}$ can be bounded by a weighted single-view GCL loss $\mathcal{L}_{\mathrm{COLES}}$. It shows that LightGCN, a popular GR encoder, corresponds to a linear GCN with one-hot inputs and no nonlinearities/self-loops, providing theoretical justification for design choices in GR. The authors establish an explicit bound relating $\mathcal{L}_{\mathrm{BPR}}$ to $\mathcal{L}_{\mathrm{COLES}}^{+}$ and $\mathcal{L}_{\mathrm{COLES}}^{-}$ with dataset-driven constants, and demonstrate through extensive experiments that the recommendation loss and the GCL loss can be exchanged for training; in particular, GR models can be trained using COLES alone. This cross-field bridge enables knowledge transfer between GR and GCL, suggesting new directions for scalability, negative sampling, and joint or standalone training regimes across both communities.

Abstract

A graph recommender (GR) is a type of graph neural network (GNN) encoder customized for extracting information from the user-item interaction graph. Owing to its strong performance on the recommendation task, GR has gained significant attention recently. Graph contrastive learning (GCL) is also a popular research direction that aims to learn GNNs, often in an unsupervised manner, with certain contrastive objectives. As a general graph representation learning method, GCL has been widely adopted alongside the supervised recommendation loss for joint training of GRs. Despite the intersection of GR and GCL research, theoretical understanding of the relationship between the two fields is surprisingly sparse. This lack of understanding inevitably leads to inefficient research. In this paper, we aim to bridge the gap between the fields of GR and GCL from the perspective of encoders and loss functions. Under mild assumptions, we show theoretically that a graph recommender is equivalent to a commonly used single-view graph contrastive model. Specifically, we find that (1) the classic encoder in GR is essentially a linear graph convolutional network with one-hot inputs, and (2) the loss function in GR is well bounded by a single-view GCL loss with certain hyperparameters. The first observation enables us to explain crucial designs of GR models, e.g., the removal of self-loops and nonlinearity. The second finding readily prompts many cross-field research directions. We show empirically that the recommendation loss and the GCL loss can be used interchangeably. The fact that we can train GR models solely with the GCL loss is particularly insightful, since before this work GCL methods were typically viewed as unsupervised methods that need fine-tuning. We also discuss potential future work inspired by our theory.
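
For readers unfamiliar with the recommendation loss discussed above, the following is a minimal sketch of the BPR loss in its commonly used pairwise form, $-\log \sigma(s_{ui} - s_{uj})$ averaged over (user, positive item, negative item) triples, with dot-product scores. The function names and the toy vectors are assumptions made for illustration; the paper's exact formulation (for example, sampling scheme or regularization) may differ.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def bpr_loss(user_vecs, pos_item_vecs, neg_item_vecs):
    """Mean of -log sigmoid(s_ui - s_uj) over (user, positive, negative) triples.

    s_ui is the dot-product score of a user with an interacted item,
    s_uj the score of the same user with a sampled non-interacted item.
    """
    losses = [
        -log_sigmoid(dot(u, i) - dot(u, j))
        for u, i, j in zip(user_vecs, pos_item_vecs, neg_item_vecs)
    ]
    return sum(losses) / len(losses)


# Hypothetical toy embeddings: one user, one positive item, one negative item.
users = [[1.0, 0.0]]
positives = [[2.0, 0.0]]
negatives = [[0.5, 0.0]]
loss_ranked_correctly = bpr_loss(users, positives, negatives)
loss_tied = bpr_loss(users, negatives, negatives)
```

When the positive and negative scores tie, the loss equals $\log 2$; it falls below that value once the positive item is scored higher than the negative one.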

Paper Structure (16 sections, 7 theorems, 16 equations, 2 figures, 7 tables, 2 algorithms)

Key Result

Proposition 1

If users and items have one-hot features, then LightGCN is a GCN without nonlinearity and self-loops.
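
The proposition can be illustrated numerically for a single propagation layer. The sketch below assumes the standard symmetric normalization $\tilde{A} = D^{-1/2} A D^{-1/2}$ without self-loops; the toy graph, the embedding values, and the helper names are hypothetical. With one-hot (identity) input features $X = I$, a linear GCN layer computes $\tilde{A} X W = \tilde{A} W$, so the weight matrix $W$ plays the role of LightGCN's embedding table.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def sym_normalize(adj):
    """Return D^{-1/2} A D^{-1/2}; no self-loops are added."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) if adj[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy interaction graph: users are nodes 0-1, items are nodes 2-4.
edges = [(0, 2), (0, 3), (1, 3), (1, 4)]
n = 5
adj = [[0.0] * n for _ in range(n)]
for u, i in edges:
    adj[u][i] = adj[i][u] = 1.0
a_norm = sym_normalize(adj)

# Embedding table (LightGCN view) == first-layer weight matrix (GCN view).
emb = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2], [0.0, 0.7], [0.6, -0.1]]

# LightGCN-style propagation: neighbors' embeddings are aggregated directly.
lightgcn_out = matmul(a_norm, emb)

# Linear GCN layer with one-hot inputs, no nonlinearity, no self-loops.
one_hot = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
gcn_out = matmul(matmul(a_norm, one_hot), emb)

max_gap = max(
    abs(x - y)
    for row_l, row_g in zip(lightgcn_out, gcn_out)
    for x, y in zip(row_l, row_g)
)
```

Here `max_gap` is zero up to floating-point error, which is the single-layer content of the equivalence. Plain Python lists are used only to keep the sketch dependency-free; a sparse matrix library would be the natural choice at realistic scale.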

Figures (2)

  • Figure 1: The ratio of negative coefficient $\beta_u/\beta_l$ on three real-world datasets.
  • Figure 2: Recall@20 and NDCG@20 on the Yelp2018 dataset with varying $\beta$.

Theorems & Definitions (7)

  • Proposition 1
  • Lemma 1: The relative influence in GNN [lampert2023selfloop]
  • Theorem 2: The SNR of nonlinear and linear propagation model [wei2022nonlinear]
  • Lemma 2
  • Theorem 3
  • Lemma 3
  • Theorem 4