Perfect Alignment May be Poisonous to Graph Contrastive Learning
Jingyu Liu, Huayi Tang, Yong Liu
TL;DR
The work questions the necessity of perfect alignment in Graph Contrastive Learning by showing that stronger augmentation primarily enhances downstream performance through inter-class separation rather than intra-class gathering. It develops a theoretical link between augmentation magnitude, contrastive loss, and generalization using both information-theoretic and graph-spectral analyses and derives a bound that explains how larger augmentation can improve generalization while potentially complicating optimization. The authors propose two practical augmentation strategies—information-based augmentation that preserves important components and spectrum-based augmentation that reshapes the spectrum—and validate them across multiple GCL methods and datasets, demonstrating improved downstream accuracy. This work offers a principled path for choosing augmentation strength and content in GCL, with potential to guide the design of more effective contrastive objectives and augmentation schemes in graph learning.
Abstract
Graph Contrastive Learning (GCL) aims to learn node representations by aligning positive pairs and separating negative ones. However, few researchers have examined the principles behind the specific augmentations used in graph-based learning. What kind of augmentation helps downstream performance, how does contrastive learning actually influence downstream tasks, and why does the magnitude of augmentation matter so much? This paper seeks to address these questions by establishing a connection between augmentation and downstream performance. Our findings reveal that GCL contributes to downstream tasks mainly by separating different classes rather than by gathering nodes of the same class. Perfect alignment and augmentation overlap, which would draw all intra-class samples to the same representation, therefore cannot fully explain the success of contrastive learning. To understand how augmentation aids the contrastive learning process, we further investigate generalization and find that perfect alignment, which draws each positive pair to the same representation, can help the contrastive loss but is poisonous to generalization. As a result, perfect alignment may not lead to the best downstream performance, and specifically designed augmentation is needed to achieve an appropriate level of alignment and improve downstream accuracy. We further analyse the results using information theory and graph spectrum theory and propose two simple but effective methods to verify the theory. The two methods can be easily applied to various GCL algorithms, and extensive experiments are conducted to demonstrate their effectiveness. The code is available at https://github.com/somebodyhh1/GRACEIS
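To make the "aligning positive pairs and separating negative ones" wording concrete, the following is a minimal sketch of a generic InfoNCE-style loss split into an alignment term and a separation term. It is an illustration only and not the objective used in the paper or in the linked repository; the function name `info_nce_terms`, the temperature `tau=0.5`, and the choice of cross-view negatives are all assumptions.

```python
import numpy as np

def info_nce_terms(z1, z2, tau=0.5):
    """Split a generic InfoNCE-style loss into two parts (illustrative only).

    z1, z2: (n, d) arrays of node embeddings from two augmented views;
    row i of z1 and row i of z2 are treated as a positive pair, and all
    other cross-view rows act as negatives (an assumed simplification).
    """
    # L2-normalise so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    sim = z1 @ z2.T / tau          # (n, n) temperature-scaled similarities
    pos = np.diag(sim)             # similarity of each positive pair

    # Numerically stable log-sum-exp over all candidates in the other view.
    row_max = sim.max(axis=1, keepdims=True)
    lse = (row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))).ravel()

    alignment = -pos.mean()        # lower when positive pairs agree more
    separation = lse.mean()        # lower when other nodes are pushed apart
    return alignment, separation, alignment + separation

# Hypothetical usage with random embeddings and a noisy second view.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
a_perfect, s_perfect, loss_perfect = info_nce_terms(z, z)
a_noisy, s_noisy, loss_noisy = info_nce_terms(z, z + 0.5 * rng.normal(size=z.shape))
```

When the two views are identical, every positive pair has cosine similarity 1, so the alignment term reaches its minimum of `-1/tau`. The abstract's claim is that driving this term to its minimum does not by itself guarantee the best downstream accuracy; the sketch only shows how the two terms can be inspected separately.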
