Rethinking Spectral Augmentation for Contrast-based Graph Self-Supervised Learning

Xiangru Jian, Xinjian Zhao, Wei Pang, Chaolong Ying, Yimu Wang, Yaoyao Xu, Tianshu Yu

TL;DR

This work critically evaluates the role of spectral augmentation in contrast-based graph self-supervised learning. By combining theoretical bounds on the InfoNCE loss for shallow GNNs with extensive empirical comparisons across node- and graph-level tasks, it shows that simple edge perturbations such as DropEdge and AddEdge not only match but frequently outperform spectral augmentation methods, while also offering far greater computational efficiency. The analysis reveals that spectral cues are largely inaccessible to shallow encoders and that optimizing augmentation strength can yield more practical gains than complex spectral manipulations. Collectively, the findings suggest a shift toward simple topology perturbations in CG-SSL and highlight the need to probe deeper architectures if spectral information is to be effectively leveraged.

Abstract

The recent surge in contrast-based graph self-supervised learning has prominently featured an intensified exploration of spectral cues. Spectral augmentation, which involves modifying a graph's spectral properties such as eigenvalues or eigenvectors, is widely believed to enhance model performance. However, an intriguing paradox emerges, as methods grounded in seemingly conflicting assumptions regarding the spectral domain demonstrate notable enhancements in learning performance. Through extensive empirical studies, we find that simple edge perturbations - random edge dropping for node-level and random edge adding for graph-level self-supervised learning - consistently yield comparable or superior performance while being significantly more computationally efficient. This suggests that the computational overhead of sophisticated spectral augmentations may not justify their practical benefits. Our theoretical analysis of the InfoNCE loss bounds for shallow GNNs further supports this observation. The proposed insights represent a significant leap forward in the field, potentially refining the understanding and implementation of graph self-supervised learning.
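The two edge perturbations named in the abstract, random edge dropping and random edge adding, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names `drop_edges` and `add_edges`, the undirected edge-list representation, and the `p` and `ratio` parameters are all assumptions made for this example.

```python
import numpy as np


def drop_edges(edges, p, rng):
    """Remove each undirected edge independently with probability p (assumed DropEdge variant)."""
    edges = np.asarray(edges, dtype=np.int64).reshape(-1, 2)
    keep = rng.random(len(edges)) >= p  # True where the edge survives
    return edges[keep]


def add_edges(edges, num_nodes, ratio, rng):
    """Add about ratio * |E| new undirected edges sampled uniformly from non-edges (assumed AddEdge variant)."""
    edges = np.asarray(edges, dtype=np.int64).reshape(-1, 2)
    existing = {(min(u, v), max(u, v)) for u, v in edges.tolist()}
    n_add = int(round(ratio * len(edges)))
    # Never request more new edges than there are free node pairs.
    n_add = min(n_add, num_nodes * (num_nodes - 1) // 2 - len(existing))
    new = set()
    while len(new) < n_add:
        u, v = (int(x) for x in rng.integers(0, num_nodes, size=2))
        if u == v:
            continue  # skip self-loops
        e = (min(u, v), max(u, v))
        if e not in existing and e not in new:
            new.add(e)
    if not new:
        return edges
    return np.concatenate([edges, np.array(sorted(new), dtype=np.int64)], axis=0)


rng = np.random.default_rng(0)
cycle = np.array([[0, 1], [1, 2], [2, 3], [3, 0]])
view_a = drop_edges(cycle, p=0.2, rng=rng)               # node-level style view
view_b = add_edges(cycle, num_nodes=4, ratio=0.5, rng=rng)  # graph-level style view
```

Both functions cost time roughly linear in the number of edges and need no eigendecomposition, which is consistent with the efficiency argument made above. The perturbation strength (`p` or `ratio`) is the only hyperparameter in this sketch.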

Paper Structure (44 sections, 8 theorems, 91 equations, 4 figures, 12 tables)

Key Result

Theorem 1

Given a graph $\mathcal{G}$ with minimum degree $d_{\min}$ and maximum degree $d_{\max}$, and its augmentation $\mathcal{G}'$ with local topological perturbation strength $\delta$, for a $k$-layer GNN with ReLU activation and weight matrices satisfying $\left\| \mathbf{W}^{(l)} \right\|_2 \leq L_W$, where $\epsilon$ is as defined in Lemma 5 and $\epsilon'$ is as defined in Lemma 6. Det...

Figures (4)

  • Figure 1: Accuracy of CG-SSL vs. number of GCN layers for node and graph classification on four datasets. (a) G-BT on node classification. (b) MVGRL on node classification. (c) G-BT on graph classification. (d) MVGRL on graph classification. Two representative datasets are chosen for each task: Cora and CiteSeer for node-level classification, and PROTEINS and IMDB-BINARY for graph-level classification. The evaluation protocol, along with dataset details and other experimental settings, is provided in the experimental settings section.
  • Figure 2: The spectrum distributions of graphs on different graph classification datasets. MUTAG and PROTEINS are chosen because they are representative of all the graph classification datasets. OG means original graph and AUG means augmented graph. The augmentation method is AddEdge with the best parameter for the G-BT method.
  • Figure 3: The spectrum distributions of graphs on different node classification datasets. Cora, CiteSeer, and Computers are chosen because they are representative of all the node classification datasets. OG means original graph and AUG means average augmented graphs. The augmentation method is DropEdge with the best parameter for the G-BT method.
  • Figure 4: Comparison of SPAN performance before and after applying SPA. Even after the spectrum is severely disrupted, the performance of SPAN remains comparable to that of the original version.

Theorems & Definitions (17)

  • Theorem 1: InfoNCE Loss Bounds
  • Definition 1: Local Topological Perturbation
  • Definition 2: InfoNCE Loss
  • Lemma 1: Adjacency Matrix Perturbation
  • proof
  • Lemma 2: Degree Matrix Change
  • proof
  • Lemma 3: Bounded Change in Normalized Adjacency Matrix
  • proof
  • Lemma 4: GNN Output Difference Bound
  • ...and 7 more
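
Since Theorem 1 and Definition 2 both concern the InfoNCE loss, a small numerical sketch may help fix ideas. The form below is a commonly used one (cosine similarity, temperature `tau`, the matching row of the other view as the positive and all other rows as negatives); it is an assumption and may differ in detail from the paper's Definition 2. The function name `info_nce` and the default `tau=0.5` are chosen for this example only.

```python
import numpy as np


def info_nce(z1, z2, tau=0.5):
    """Mean InfoNCE loss where row i of z1 and row i of z2 form the positive pair."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                             # pairwise cosine similarity / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Two views whose embeddings agree give a lower loss than mismatched views.
z = np.eye(3)
aligned = info_nce(z, z)
mismatched = info_nce(z, np.roll(z, 1, axis=0))
```

In a CG-SSL pipeline, `z1` and `z2` would be the encoder outputs for the original and augmented graphs, so a bound on how far the embeddings can move under a perturbation translates into a bound on this quantity.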