Community-Invariant Graph Contrastive Learning

Shiyin Tan, Dongyuan Li, Renhe Jiang, Ying Zhang, Manabu Okumura

TL;DR

CI-GCL tackles the limitation that traditional graph contrastive learning can disrupt high-level graph communities during augmentation. By proving and leveraging a community-invariant principle, it unifies topology and feature augmentation under a spectral-change objective, maximizing changes in graph spectrum while preserving community structure. The framework combines differentiable topology perturbations with a bipartite feature augmentation and optimizes a joint loss that enforces community invariance, supported by theoretical insights and scalable algorithms. Empirically, CI-GCL achieves state-of-the-art or competitive results across 21 benchmarks in unsupervised, semi-supervised, and transfer settings, and shows improved robustness to noise and clear community preservation on synthetic data. This work offers a principled route to incorporate graph structure into learnable augmentations, with broad implications for generalization and transfer in graph representation learning.

Abstract

Graph augmentation has received great attention in recent years for graph contrastive learning (GCL) to learn well-generalized node/graph representations. However, mainstream GCL methods often favor randomly disrupting graphs for augmentation, which shows limited generalization and inevitably leads to the corruption of high-level graph information, i.e., the graph community. Moreover, current knowledge-based graph augmentation methods can only focus on either topology or node features, causing the model to lack robustness against various types of noise. To address these limitations, this research investigated the role of the graph community in graph augmentation and identified its crucial advantage for learnable graph augmentation. Based on our observations, we propose a community-invariant GCL framework to maintain graph community structure during learnable graph augmentation. By maximizing the spectral changes, this framework unifies the constraints of both topology and feature augmentation, enhancing the model's robustness. Empirical evidence on 21 benchmark datasets demonstrates the merits of our framework. Code is released on GitHub (https://github.com/ShiyinTan/CI-GCL.git).

Paper Structure

This paper contains 48 sections, 9 theorems, 34 equations, 9 figures, 13 tables, 1 algorithm.

Key Result

Theorem 1

The absolute spectral changes $\sum_{k=1}^{n} |\Delta \lambda_{k}|$ are upper bounded by $\left\|\mathbf{U}_{i \cdot}-\mathbf{U}_{ j \cdot}\right\|^{2}_{2} + \sum_{k=1}^{n} |\lambda_{k}-1|$ and lower bounded by $\left\|\mathbf{U}_{ i \cdot}-\mathbf{U}_{j \cdot}\right\|^{2}_{2} - \sum_{k=1}^{n} |\l

Figures (9)

  • Figure 1: In unsupervised graph classification, we define community changes as the average ratio of the changed community labels over the number of nodes before and after graph augmentation by spectral clustering. Spectral changes are the eigenvalue changes between original and augmented graphs, using the $L_{2}$ distance.
  • Figure 2: The proposed CI-GCL consists of two core components: (1) Learnable graph augmenter optimizes $T_{m}(G)$ to disrupt redundant information while ensuring community invariance from the original graph. (2) The GNN encoder $f_{\theta}(\cdot)$ and Readout $r_{\phi}(\cdot)$ maximize the mutual information between two augmented graphs by contrastive loss. We use edge dropping and feature masking as an instantiation.
  • Figure 3: Accuracy (%) under noise attack on two datasets.
  • Figure 4: A case study of topology augmentation (TA) and feature augmentation (FA) in GraphCL and CI-GCL. Panels (B-C) share the same color map; green lines mark dropped edges and red lines mark added edges.
  • Figure C1: Parameter sensitivity with respect to the number of selected spectral components $K$, the balance weight of the topological constraint $\boldsymbol{\alpha}$, and the balance weight of the feature constraint $\boldsymbol{\beta}$.
  • ...and 4 more figures
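
The two quantities defined in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It rests on several assumptions not stated in the caption: eigenvalues are taken from the symmetric normalized Laplacian, spectral clustering is reduced to a two-community split by the sign of the Fiedler vector, and the measure is computed for a single toy graph rather than averaged over a dataset. The function names and the toy graph are hypothetical and do not come from the CI-GCL code.

```python
import numpy as np

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2} (assumed choice of graph matrix)."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def spectral_change(adj_a, adj_b):
    """L2 distance between the sorted eigenvalues of two graphs."""
    ev_a = np.linalg.eigvalsh(normalized_laplacian(adj_a))
    ev_b = np.linalg.eigvalsh(normalized_laplacian(adj_b))
    return float(np.linalg.norm(ev_a - ev_b))

def two_way_labels(adj):
    """Simplified spectral clustering: sign of the second eigenvector."""
    _, vecs = np.linalg.eigh(normalized_laplacian(adj))
    return (vecs[:, 1] > 0).astype(int)

def community_change(adj_a, adj_b):
    """Fraction of nodes whose community label differs between graphs."""
    mismatch = float(np.mean(two_way_labels(adj_a) != two_way_labels(adj_b)))
    # Label names are arbitrary, so also consider the swapped assignment.
    return min(mismatch, 1.0 - mismatch)

def toggle_edges(adj, edges):
    """Return a copy with each listed undirected edge added or dropped."""
    out = adj.copy()
    for i, j in edges:
        out[i, j] = out[j, i] = 1.0 - out[i, j]
    return out

# Toy graph: two 4-cliques (nodes 0-3 and 4-7) joined by the bridge 3-4.
adj = np.zeros((8, 8))
adj[:4, :4] = 1.0
adj[4:, 4:] = 1.0
np.fill_diagonal(adj, 0.0)
adj[3, 4] = adj[4, 3] = 1.0

# Perturbation 1: drop one edge inside the first clique.
intra = toggle_edges(adj, [(0, 1)])
# Perturbation 2: rewire node 3 so that it mostly attaches to the second clique.
moved = toggle_edges(adj, [(3, 0), (3, 1), (3, 5), (3, 6), (3, 7)])

for name, aug in [("intra-community drop", intra), ("rewire node 3", moved)]:
    print(name, spectral_change(adj, aug), community_change(adj, aug))
```

On this toy graph both perturbations change the spectrum, but only the second moves a node across the community boundary. Augmentations of the first kind, which alter the spectrum while leaving community labels intact, are the ones the caption's two measures are meant to distinguish.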

Theorems & Definitions (16)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Definition F1
  • Lemma F1
  • proof
  • Theorem F1
  • proof
  • Theorem F2
  • ...and 6 more