Community-Invariant Graph Contrastive Learning
Shiyin Tan, Dongyuan Li, Renhe Jiang, Ying Zhang, Manabu Okumura
TL;DR
CI-GCL tackles the limitation that traditional graph contrastive learning can disrupt high-level graph communities during augmentation. By proving and leveraging a community-invariant principle, it unifies topology and feature augmentation under a spectral-change objective, maximizing changes in graph spectrum while preserving community structure. The framework combines differentiable topology perturbations with a bipartite feature augmentation and optimizes a joint loss that enforces community invariance, supported by theoretical insights and scalable algorithms. Empirically, CI-GCL achieves state-of-the-art or competitive results across 21 benchmarks in unsupervised, semi-supervised, and transfer settings, and shows improved robustness to noise and clear community preservation on synthetic data. This work offers a principled route to incorporate graph structure into learnable augmentations, with broad implications for generalization and transfer in graph representation learning.
Abstract
Graph augmentation has received great attention in recent years as a way for graph contrastive learning (GCL) to learn well-generalized node/graph representations. However, mainstream GCL methods often favor randomly disrupting graphs for augmentation, which shows limited generalization and inevitably corrupts high-level graph information, i.e., the graph community. Moreover, current knowledge-based graph augmentation methods can only focus on either topology or node features, leaving the model without robustness against various types of noise. To address these limitations, we investigate the role of the graph community in graph augmentation and identify its crucial advantage for learnable graph augmentation. Based on our observations, we propose a community-invariant GCL framework that maintains graph community structure during learnable graph augmentation. By maximizing the spectral changes, this framework unifies the constraints of both topology and feature augmentation, enhancing the model's robustness. Empirical evidence on 21 benchmark datasets demonstrates the distinctive merits of our framework. Code is released on GitHub (https://github.com/ShiyinTan/CI-GCL.git).
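To make the notion of "spectral change" concrete, the following is a minimal sketch and not the released CI-GCL code. It assumes spectral change can be measured as the L2 distance between the sorted eigenvalues of the symmetric normalized Laplacian before and after an edge perturbation; this measure, the toy graph, and the helper names (`normalized_laplacian`, `spectrum`, `spectral_change`, `drop_edge`) are illustrative assumptions rather than the paper's exact objective.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian; isolated nodes get an all-zero row."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    nonzero = deg > 0
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return np.diag(nonzero.astype(float)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def spectrum(adj):
    """Eigenvalues of the normalized Laplacian in ascending order."""
    return np.linalg.eigvalsh(normalized_laplacian(adj))

def spectral_change(adj_a, adj_b):
    """Assumed measure: L2 distance between the two sorted spectra."""
    return float(np.linalg.norm(spectrum(adj_a) - spectrum(adj_b)))

def drop_edge(adj, i, j):
    """Return a copy of the graph with the undirected edge (i, j) removed."""
    out = np.array(adj, dtype=float)
    out[i, j] = out[j, i] = 0.0
    return out

# Hypothetical toy graph: two triangles (communities) joined by the bridge (2, 3).
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

intra_change = spectral_change(adj, drop_edge(adj, 0, 1))  # edge inside a community
inter_change = spectral_change(adj, drop_edge(adj, 2, 3))  # edge between communities
print(f"intra-community edge removed: {intra_change:.4f}")
print(f"inter-community edge removed: {inter_change:.4f}")
```

Comparing the two printed values shows how differently the spectrum responds to perturbing an edge within a community and one between communities. A learnable augmentation would optimize over such perturbations instead of enumerating them by hand.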
