Rethinking Spectral Augmentation for Contrast-based Graph Self-Supervised Learning

Xiangru Jian; Xinjian Zhao; Wei Pang; Chaolong Ying; Yimu Wang; Yaoyao Xu; Tianshu Yu

TL;DR

This work critically evaluates the role of spectral augmentation in contrast-based graph self-supervised learning (CG-SSL). By combining theoretical bounds on the InfoNCE loss for shallow GNNs with extensive empirical comparisons across node- and graph-level tasks, it shows that simple edge perturbations such as DropEdge and AddEdge not only match but frequently outperform spectral augmentation methods, while also being far more computationally efficient. The analysis indicates that spectral cues are largely inaccessible to shallow encoders and that tuning augmentation strength can yield more practical gains than complex spectral manipulations. Collectively, the findings suggest a shift toward simple topology perturbations in CG-SSL and highlight the need to probe deeper architectures if spectral information is to be effectively leveraged.

Abstract

The recent surge in contrast-based graph self-supervised learning has prominently featured an intensified exploration of spectral cues. Spectral augmentation, which involves modifying a graph's spectral properties such as eigenvalues or eigenvectors, is widely believed to enhance model performance. However, an intriguing paradox emerges, as methods grounded in seemingly conflicting assumptions regarding the spectral domain demonstrate notable enhancements in learning performance. Through extensive empirical studies, we find that simple edge perturbations - random edge dropping for node-level and random edge adding for graph-level self-supervised learning - consistently yield comparable or superior performance while being significantly more computationally efficient. This suggests that the computational overhead of sophisticated spectral augmentations may not justify their practical benefits. Our theoretical analysis of the InfoNCE loss bounds for shallow GNNs further supports this observation. The proposed insights represent a significant leap forward in the field, potentially refining the understanding and implementation of graph self-supervised learning.
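To make the two edge perturbations named in the abstract concrete, here is a minimal sketch of random edge dropping and random edge adding on an undirected edge list. The function names, the `p` and `ratio` parameters, and the toy graph are illustrative assumptions; the page does not give the paper's exact sampling scheme or hyperparameters.

```python
import random


def drop_edges(edges, p, seed=None):
    """Keep each undirected edge independently with probability 1 - p."""
    rng = random.Random(seed)
    return [e for e in edges if rng.random() >= p]


def add_edges(edges, num_nodes, ratio, seed=None):
    """Add round(ratio * len(edges)) new undirected edges, sampled uniformly
    from node pairs that are not yet connected (no self-loops).

    Assumption: edges are stored as (u, v) tuples with u < v.
    """
    rng = random.Random(seed)
    existing = {tuple(sorted(e)) for e in edges}
    candidates = [
        (u, v)
        for u in range(num_nodes)
        for v in range(u + 1, num_nodes)
        if (u, v) not in existing
    ]
    k = min(round(ratio * len(edges)), len(candidates))
    return list(edges) + rng.sample(candidates, k)


# Toy example: a 5-node path graph (hypothetical data).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dropped = drop_edges(edges, p=0.5, seed=0)               # one node-level style view
added = add_edges(edges, num_nodes=5, ratio=0.5, seed=0)  # one graph-level style view
```

Enumerating all candidate pairs is quadratic in the number of nodes, which is fine for a toy graph; a larger graph would call for rejection sampling instead. Either way, neither function needs an eigendecomposition, which is consistent with the efficiency argument in the abstract.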

Paper Structure (44 sections, 8 theorems, 91 equations, 4 figures, 12 tables)

This paper contains 44 sections, 8 theorems, 91 equations, 4 figures, 12 tables.

Introduction
Related work
Preliminary study
Limitations of spectral augmentations
Numerical Estimation and Interpretation.
Edge perturbation is all you need
Advantage of edge perturbation over spectral augmentations
Optimal learning performance.
Experiments on SSL performance
Experimental Settings
Experimental results
Ablation Study
The insignificance of Spectral Cues
Degeneration of the spectrum after Edge Perturbation (EP)
Spectral Perturbation
...and 29 more sections

Key Result

Theorem 1

Given a graph $\mathcal{G}$ with minimum degree $d_{\min}$ and maximum degree $d_{\max}$, and its augmentation $\mathcal{G}'$ with local topological perturbation strength $\delta$, for a $k$-layer GNN with ReLU activation and weight matrices satisfying $\left\| \mathbf{W}^{(l)} \right\|_2 \leq L_W$, where $\epsilon$ is as defined in Lemma 5 and $\epsilon'$ is as defined in Lemma 6. Det
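For orientation, the quantity that Theorem 1 bounds is an InfoNCE loss between the embeddings of the two views. The paper's own Definition 2 is not reproduced on this page, so the sketch below uses one commonly used form, with cosine similarity, a temperature `tau`, and the other nodes of the second view as negatives. All of these choices are assumptions, not the paper's definition.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(z1, z2, tau=0.5):
    """Mean over anchors i of -log(exp(s_ii / tau) / sum_j exp(s_ij / tau)),
    where s_ij is the cosine similarity between z1[i] and z2[j].

    Assumption: z1[i] and z2[i] embed the same node in the two views.
    """
    n = len(z1)
    total = 0.0
    for i in range(n):
        logits = [cosine(z1[i], z2[j]) / tau for j in range(n)]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Hypothetical 2-node embeddings: aligned views versus mismatched views.
z = [[1.0, 0.0], [0.0, 1.0]]
z_swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = info_nce(z, z, tau=0.5)
loss_mismatched = info_nce(z, z_swapped, tau=0.5)
```

In this form the loss is small when each node's two embeddings agree and large when they do not, so a bound on how far a perturbation of strength $\delta$ can move the embeddings translates into a bound on the loss.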

Figures (4)

Figure 1: Accuracy of CG-SSL vs. number of GCN layers on node and graph classification on four datasets. (a) G-BT on node classification. (b) MVGRL on node classification. (c) G-BT on graph classification. (d) MVGRL on graph classification. We choose two representative datasets for each task, i.e., Cora and CiteSeer for node-level classification and PROTEINS and IMDB-BINARY for graph-level classification. The evaluation protocol, along with dataset details and other experimental settings, is provided in the Experimental Settings section.
Figure 2: The spectrum distributions of graphs on different graph classification datasets. MUTAG and PROTEINS are chosen as they are representative of all the graph classification datasets. OG means original graph and AUG means augmented graph. The augmentation method is AddEdge with the best parameter for the G-BT method.
Figure 3: The spectrum distributions of graphs on different node classification datasets. Cora, CiteSeer, and Computers are chosen as they are representative of all the node classification datasets. OG means original graph and AUG means average augmented graphs. The augmentation method is DropEdge with the best parameter for the G-BT method.
Figure 4: Comparison of SPAN performance before and after applying SPA. After severely disrupting the spectrum, the performance of SPAN remains comparable to that of the original version.
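The OG versus AUG comparison in Figures 2 and 3 can be approximated with a short NumPy sketch that histograms the spectrum of a graph before and after edge dropping. The choice of the symmetric normalized Laplacian, the random toy graph, the drop rate of 0.3, and the use of a single augmented sample (rather than an average over augmentations) are all assumptions; the captions do not specify these details.

```python
import numpy as np


def normalized_laplacian_spectrum(adj):
    """Eigenvalues of L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep a zero scale
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    # Eigenvalues lie in [0, 2]; clip away tiny floating-point overshoot.
    return np.clip(np.linalg.eigvalsh(lap), 0.0, 2.0)


def drop_edges_dense(adj, p, rng):
    """Drop each undirected edge independently with probability p."""
    upper = np.triu(adj, k=1)
    keep = rng.random(upper.shape) >= p
    upper = upper * keep
    return upper + upper.T


# Hypothetical toy graph: 30 nodes, each pair connected with probability 0.2.
rng = np.random.default_rng(0)
n = 30
upper = np.triu((rng.random((n, n)) < 0.2).astype(float), k=1)
adj = upper + upper.T

aug = drop_edges_dense(adj, 0.3, rng)
spec_og = normalized_laplacian_spectrum(adj)
spec_aug = normalized_laplacian_spectrum(aug)

# Histogram both spectra on the same bins so they can be compared directly.
hist_og, bin_edges = np.histogram(spec_og, bins=10, range=(0.0, 2.0))
hist_aug, _ = np.histogram(spec_aug, bins=10, range=(0.0, 2.0))
```

Comparing `hist_og` with `hist_aug` gives a rough numerical counterpart to the plotted distributions. Reproducing the figures would require the actual datasets and the tuned augmentation parameters.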

Theorems & Definitions (17)

Theorem 1: InfoNCE Loss Bounds
Definition 1: Local Topological Perturbation
Definition 2: InfoNCE Loss
Lemma 1: Adjacency Matrix Perturbation
proof
Lemma 2: Degree Matrix Change
proof
Lemma 3: Bounded Change in Normalized Adjacency Matrix
proof
Lemma 4: GNN Output Difference Bound
...and 7 more
