Graph Contrastive Learning with Cohesive Subgraph Awareness

Yucheng Wu; Leye Wang; Xiao Han; Han-Jia Ye

TL;DR

This work addresses the sensitivity of graph contrastive learning (GCL) to topology augmentations by introducing CTAug, a cohesion-aware framework that preserves cohesive subgraphs during augmentation and strengthens subgraph-aware learning. CTAug combines two modules, Topology Augmentation Enhancement and Graph Learning Enhancement, and can be applied on top of existing GCL methods (e.g., GraphCL, JOAO, MVGRL) using cohesive priors such as $k$-core and $k$-truss. Its components include refinements to probabilistic and deterministic augmentation, an original-graph-oriented substructure network (O-GSN), and multi-cohesion embedding fusion. A mutual-information analysis supports the design, and experiments show gains on high-degree graphs and competitive performance on node-level tasks. The results suggest that cohesive subgraph knowledge is useful in self-supervised graph representation learning and may extend to other substructures and domains.

Abstract

Graph contrastive learning (GCL) has emerged as a state-of-the-art strategy for learning representations of diverse graphs including social and biomedical networks. GCL widely uses stochastic graph topology augmentation, such as uniform node dropping, to generate augmented graphs. However, such stochastic augmentations may severely damage the intrinsic properties of a graph and deteriorate the following representation learning process. We argue that incorporating an awareness of cohesive subgraphs during the graph augmentation and learning processes has the potential to enhance GCL performance. To this end, we propose a novel unified framework called CTAug, to seamlessly integrate cohesion awareness into various existing GCL mechanisms. In particular, CTAug comprises two specialized modules: topology augmentation enhancement and graph learning enhancement. The former module generates augmented graphs that carefully preserve cohesion properties, while the latter module bolsters the graph encoder's ability to discern subgraph patterns. Theoretical analysis shows that CTAug can strictly improve existing GCL mechanisms. Empirical experiments verify that CTAug can achieve state-of-the-art performance for graph representation learning, especially for graphs with high degrees. The code is available at https://doi.org/10.5281/zenodo.10594093, or https://github.com/wuyucheng2002/CTAug.
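
To make the idea of cohesion-preserving augmentation concrete, the following is a minimal sketch, not the authors' implementation. It assumes the cohesive prior is the $k$-core with the largest $k$, and that nodes inside it have their drop probability scaled by a factor `decay`. The names `core_numbers`, `cohesion_aware_node_drop`, `drop_prob`, and `decay` are hypothetical and chosen only for illustration.

```python
import random
from typing import Dict, Hashable, Optional, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def core_numbers(adj: Graph) -> Dict[Hashable, int]:
    """Core number of every node, computed by minimum-degree peeling."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core: Dict[Hashable, int] = {}
    k = 0
    while remaining:
        # Linear scan for the minimum-degree node; adequate for a sketch.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def cohesion_aware_node_drop(
    adj: Graph,
    drop_prob: float = 0.2,
    decay: float = 0.5,
    seed: Optional[int] = None,
) -> Graph:
    """Drop nodes at random, sparing the densest k-core more often.

    Assumption (not taken from the paper): nodes in the k_max-core are
    dropped with probability drop_prob * decay; all others with drop_prob.
    """
    rng = random.Random(seed)
    core = core_numbers(adj)
    k_max = max(core.values(), default=0)
    kept = set()
    for v in adj:
        in_cohesive = k_max > 0 and core[v] == k_max
        p = drop_prob * decay if in_cohesive else drop_prob
        if rng.random() >= p:
            kept.add(v)
    # Induced subgraph on the surviving nodes.
    return {v: adj[v] & kept for v in kept}
```

Setting `decay=1.0` recovers uniform node dropping, while `decay=0.0` never removes a node of the densest core. A $k$-truss prior would need a different cohesion routine but could reuse the same sampling step.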

Paper Structure (34 sections, 6 theorems, 15 equations, 4 figures, 16 tables)

Introduction
Background and Related Work
Cohesive Subgraph
Topology Augmentation in GCL
The CTAug Framework
Topology Augmentation Enhancement
Probabilistic Topology Augmentation
Deterministic Topology Augmentation
Graph Learning Enhancement
Subgraph-aware GNN Encoder
Multi-Cohesion Embedding Fusion
Extension for Node Embedding Learning
How CTAug Powers GCL?
Topology Augmentation Enhancement
Graph Learning Enhancement
...and 19 more sections

Key Result

Theorem 4.3

Suppose $f$ is a minimal sufficient encoder. If $I(\mathcal{G}';\mathcal{G};y)$ increases, then $I(f(\mathcal{G});y)$ will also increase.
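
This summary does not define the three-way term. One common convention, stated here as an assumption rather than as the paper's definition, reads it as the co-information shared by the three variables, with $\mathcal{G}'$ taken to be the augmented view of $\mathcal{G}$:

```latex
I(\mathcal{G}';\mathcal{G};y) \;=\; I(\mathcal{G}';\mathcal{G}) \;-\; I(\mathcal{G}';\mathcal{G}\mid y)
```

Under that reading, the theorem says that an augmentation sharing more label-relevant information with the original graph leads a minimal sufficient encoder to embeddings that are more informative about $y$.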

Figures (4)

Figure 1: Overview of the CTAug Framework. Module 1 enhances the probabilistic and deterministic augmentation process separately with the consideration of the cohesive subgraphs; Module 2 boosts GNN encoder to better capture the original graph's cohesion properties.
Figure 2: CTAug's improvement on datasets with varying average degrees.
Figure 3: Scalability test on RDT-T.
Figure 4: Histogram of $k_{\max}$ ($k$-core).

Theorems & Definitions (10)

Definition 4.1
Definition 4.2
Theorem 4.3
Theorem 4.4
Lemma A.1
Definition A.2
Definition A.3
Theorem A.4
Lemma A.5
Theorem A.6
