Towards Graph Contrastive Learning: A Survey and Beyond

Wei Ju, Yifan Wang, Yifang Qin, Zhengyang Mao, Zhiping Xiao, Junyu Luo, Junwei Yang, Yiyang Gu, Dongjie Wang, Qingqing Long, Siyu Yi, Xiao Luo, Ming Zhang

TL;DR

This survey analyzes Graph Contrastive Learning (GCL) within graph self-supervised learning, detailing data augmentation strategies, intra-/inter-scale contrastive modes, and both contrastive and non-contrastive optimization methods. It then extends GCL to data-efficient paradigms such as weakly supervised and transfer learning, and surveys a wide range of real-world applications, including drug discovery and genomics. The work highlights key challenges, such as theoretical foundations, augmentation design, interpretability, and robustness, and outlines future directions to advance GCL's effectiveness and adoption. Overall, the paper positions GCL as a central, versatile component for scalable, label-efficient graph representation learning across diverse domains.

Abstract

In recent years, deep learning on graphs has achieved remarkable success in various domains. However, the reliance on annotated graph data remains a significant bottleneck because annotation is costly and time-intensive. To address this challenge, self-supervised learning (SSL) on graphs has gained increasing attention and has made significant progress. SSL enables machine learning models to produce informative representations from unlabeled graph data, reducing the reliance on expensive labeled data. While SSL on graphs has witnessed widespread adoption, one critical component, Graph Contrastive Learning (GCL), has not been thoroughly investigated in the existing literature. This survey aims to fill that gap with a dedicated treatment of GCL. We provide a comprehensive overview of the fundamental principles of GCL, including data augmentation strategies, contrastive modes, and contrastive optimization objectives. Furthermore, we explore the extensions of GCL to other aspects of data-efficient graph learning, such as weakly supervised learning, transfer learning, and related scenarios. We also discuss practical applications spanning domains such as drug discovery, genomics analysis, and recommender systems, and finally outline the challenges and potential future directions in this field.

Paper Structure

This paper contains 31 sections, 38 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: An overview of the taxonomy for existing GCL models.
  • Figure 2: The general framework of graph contrastive learning (GCL). A contrastive method is determined by three choices: its data augmentation strategy, which generates different views; its contrastive mode, which aligns instances at different scales; and its contrastive optimization strategy.
  • Figure 3: GCL in graph weakly supervised learning.
  • Figure 4: GCL in graph transfer learning.
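
The three components named in the Figure 2 caption (augmentation, contrastive mode, optimization objective) can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration and is not taken from the survey: the function names are hypothetical, the "encoder" is a parameter-free neighbour average standing in for a trained GNN, the contrastive mode is node-to-node across two views, and the objective is an InfoNCE-style loss chosen as one common option.

```python
import math
import random


def drop_edges(edges, p, rng):
    """Augmentation (assumed): keep each edge independently with probability 1 - p."""
    return [e for e in edges if rng.random() >= p]


def mask_features(x, p, rng):
    """Augmentation (assumed): zero out each feature column with probability p."""
    dim = len(x[0])
    keep = [0.0 if rng.random() < p else 1.0 for _ in range(dim)]
    return [[v * k for v, k in zip(row, keep)] for row in x]


def encode(x, edges):
    """Toy encoder (stand-in for a GNN): average a node's features with
    those of its neighbours, then L2-normalise the result."""
    n, dim = len(x), len(x[0])
    neigh = [[i] for i in range(n)]  # each node starts with a self-loop
    for u, v in edges:
        neigh[u].append(v)
        neigh[v].append(u)
    out = []
    for i in range(n):
        h = [sum(x[j][d] for j in neigh[i]) / len(neigh[i]) for d in range(dim)]
        norm = math.sqrt(sum(v * v for v in h)) or 1.0  # guard all-zero rows
        out.append([v / norm for v in h])
    return out


def info_nce(z1, z2, tau=0.5):
    """Node-level InfoNCE-style loss: node i in view 1 is pulled towards
    node i in view 2; every other node in view 2 acts as a negative."""
    n = len(z1)
    total = 0.0
    for i in range(n):
        logits = [sum(a * b for a, b in zip(z1[i], z2[j])) / tau for j in range(n)]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / n


# Usage on a hypothetical 4-node cycle graph with 3-dimensional features.
rng = random.Random(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
x = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0], [0.5, 0.0, 1.0]]
view1 = encode(mask_features(x, 0.2, rng), drop_edges(edges, 0.2, rng))
view2 = encode(mask_features(x, 0.2, rng), drop_edges(edges, 0.2, rng))
loss = info_nce(view1, view2)
```

In a real system the encoder would have trainable parameters and the loss would be minimised by gradient descent; the sketch only shows how the three design choices fit together. Swapping `drop_edges` for another augmentation, or pairing nodes with a graph-level summary instead of with other nodes, would change the augmentation strategy or contrastive mode without touching the rest.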