Perfect Alignment May be Poisonous to Graph Contrastive Learning

Jingyu Liu; Huayi Tang; Yong Liu

TL;DR

The work questions the necessity of perfect alignment in Graph Contrastive Learning by showing that stronger augmentation enhances downstream performance primarily through inter-class separation rather than intra-class gathering. It develops a theoretical link between augmentation magnitude, contrastive loss, and generalization using both information-theoretic and graph-spectral analyses, and derives a bound that explains how larger augmentation can improve generalization while potentially complicating optimization. The authors propose two practical augmentation strategies, information-based augmentation that preserves important components and spectrum-based augmentation that reshapes the spectrum, and validate them across multiple GCL methods and datasets, demonstrating improved downstream accuracy. This work offers a principled path for choosing augmentation strength and content in GCL, with potential to guide the design of more effective contrastive objectives and augmentation schemes in graph learning.

Abstract

Graph Contrastive Learning (GCL) aims to learn node representations by aligning positive pairs and separating negative ones. However, few researchers have examined the principles behind the specific augmentations used in graph-based learning. What kind of augmentation helps downstream performance, how does contrastive learning actually influence downstream tasks, and why does the magnitude of augmentation matter so much? This paper seeks to address these questions by establishing a connection between augmentation and downstream performance. Our findings reveal that GCL contributes to downstream tasks mainly by separating different classes rather than gathering nodes of the same class. Perfect alignment and augmentation overlap, which draw all intra-class samples together, therefore cannot fully explain the success of contrastive learning. To understand how augmentation aids the contrastive learning process, we further investigate generalization and find that perfect alignment, which makes each positive pair identical, can help the contrastive loss but is poisonous to generalization. As a result, perfect alignment may not lead to the best downstream performance, and specifically designed augmentation is needed to achieve an appropriate degree of alignment and improve downstream accuracy. We further analyze the results using information theory and graph spectrum theory and propose two simple but effective methods to verify the theory. The two methods can be easily applied to various GCL algorithms, and extensive experiments are conducted to demonstrate their effectiveness. The code is available at https://github.com/somebodyhh1/GRACEIS
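To make "aligning positive pairs and separating negative ones" concrete, the following is a minimal sketch of an InfoNCE-style loss in plain Python. It is not the authors' implementation: the function names `cosine` and `info_nce`, the temperature value `tau=0.5`, and the toy embeddings are all assumptions chosen for illustration, and only cross-view negatives are used (some GCL methods also use intra-view negatives).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view1, view2, tau=0.5):
    """Mean InfoNCE loss where view1[i] and view2[i] form the positive pair.

    Every other row of view2 acts as a negative for node i. This is a
    simplification made for illustration (hypothetical helper, not taken
    from the paper's code).
    """
    losses = []
    for i, z in enumerate(view1):
        logits = [cosine(z, w) / tau for w in view2]
        log_denominator = math.log(sum(math.exp(l) for l in logits))
        # -log( exp(pos) / sum(exp(all)) ) = log_denominator - pos
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings: two nodes, each seen under two augmented views.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)
```

In this toy case the loss is lower when the two views of each node coincide, which is the "perfect alignment" regime; the abstract's claim is that minimizing this loss alone does not guarantee the best generalization.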

Paper Structure (36 sections, 8 theorems, 42 equations, 14 figures, 6 tables)

Introduction
Augmentation and Generalization
Preliminaries
How Does Augmentation Affect Downstream Performance
Augmentation and Generalization
Finding Better Augmentation
Information Theory Perspective
Graph Spectrum Perspective
Experiments
Augmentation Distance
Over-smooth
Conclusion
Theoretical Proof
Proof of Theorem 2.4
Proof of Theorem 2.6
...and 21 more sections

Key Result

Theorem 2.4

If the view-invariance assumption holds, we know that:

Figures (14)

Figure 1: PCD is the positive center distance ($\mathbb{E}_{p(v_y^{0}|y)}||f(v_y^{0})-\mu_y||$), NCD is the negative center distance ($\mathbb{E}_{p(v_y^{0}|y)}||f(v_y^{0})-\mu_{y^-}||$), and accuracy is the downstream performance. The x-axis is the dropout rate applied to both edges and features.
Figure 2: Augmentation distance and InfoNCE. GRACE+I stands for GRACE with information augmentation, and GRACE+S stands for GRACE with spectrum augmentation. GRACE+x_MI is the mutual information between the two views after training, and GRACE+x_$\delta_{aug}$ is the augmentation distance caused by the method.
Figure 3: Accuracy on downstream tasks with different numbers of layers.
Figure 4: Influence of $p_{\tau}$ on Cora (all data are normalized for better visualization)
Figure 5: Percentage of positive $\theta$
...and 9 more figures
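
The two distances defined in the Figure 1 caption can be estimated from a set of embeddings and labels. The sketch below is an illustration under stated assumptions, not the authors' code: the helper names `class_centers` and `center_distances` are hypothetical, class centers $\mu_y$ are taken as per-class means of the embeddings, and for NCD the distance is averaged over all other classes, which is one possible reading of $\mu_{y^-}$.

```python
import math
from collections import defaultdict


def class_centers(embeddings, labels):
    """Per-class mean embedding, used here as the class center mu_y."""
    grouped = defaultdict(list)
    for z, y in zip(embeddings, labels):
        grouped[y].append(z)
    return {
        y: [sum(column) / len(rows) for column in zip(*rows)]
        for y, rows in grouped.items()
    }


def center_distances(embeddings, labels):
    """Return (PCD, NCD) averaged over all nodes.

    PCD: Euclidean distance from each embedding to its own class center.
    NCD: distance to the centers of the other classes, averaged over them
    (an assumption made for this sketch). Requires at least two classes.
    """
    centers = class_centers(embeddings, labels)
    pcd, ncd = [], []
    for z, y in zip(embeddings, labels):
        pcd.append(math.dist(z, centers[y]))
        others = [math.dist(z, c) for label, c in centers.items() if label != y]
        ncd.append(sum(others) / len(others))
    return sum(pcd) / len(pcd), sum(ncd) / len(ncd)


# Toy data: two classes of two nodes each in a 2-D embedding space.
embeddings = [[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]]
labels = [0, 0, 1, 1]
pcd, ncd = center_distances(embeddings, labels)
print(pcd, ncd)
```

Tracking both quantities while varying augmentation strength is one way to check whether accuracy gains come from a larger NCD (class separation) or a smaller PCD (intra-class gathering).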

Theorems & Definitions (10)

Definition 2.3
Theorem 2.4: Augmentation and Classification
Definition 2.5: Mean CE loss
Theorem 2.6: Generalization and Augmentation Distance
Corollary 3.1: CE with Mutual Information
Theorem 3.2: Theorem 1 of GCL_specturm_NCE Restated
Corollary 3.3: Spectral Representation of $\delta_{aug}$
Lemma 1.1: reversed_jensen Corollary 3.5 (restated)
Lemma 1.2: chaos Lemma A.2. restated
Lemma 2.1: Change of Spectrum
