Bootstrap Latents of Nodes and Neighbors for Graph Self-Supervised Learning

Yunhui Liu, Huaisong Zhang, Tieke He, Tao Zheng, Jianhua Zhao

TL;DR

This work extends negative-free graph self-supervised learning by exploiting graph homophily through neighbor-based positive pairs. It introduces BLNN, which adds node–neighbor positives to the BGRL framework and employs a cross-attention module to softly weight neighbor contributions, reducing class collision from false positives. Empirical results across five datasets and three downstream tasks show improved intra-class compactness and state-of-the-art performance, with robust hyperparameter behavior and scalable complexity. The approach offers a practical, scalable way to leverage local label-consistent structure in graphs for improved representation learning.

Abstract

Contrastive learning is a significant paradigm in graph self-supervised learning. However, it requires negative samples to prevent model collapse and to learn discriminative representations. These negative samples inevitably lead to heavy computation, memory overhead, and class collision, compromising representation learning. Recent studies show that methods that dispense with negative samples can attain competitive performance and better scalability, as exemplified by bootstrapped graph latents (BGRL). However, BGRL neglects the inherent graph homophily, which provides valuable insights into underlying positive pairs. Our motivation arises from the observation that introducing even a few ground-truth positive pairs significantly improves BGRL. Although ground-truth positive pairs cannot be obtained without labels in the self-supervised setting, edges in the graph can serve as noisy positive pairs, since neighboring nodes often share the same label. Therefore, we propose to expand the positive pair set with node-neighbor pairs. We then introduce a cross-attention module to predict the supportiveness score of a neighbor with respect to the anchor node. This score quantifies the positive support from each neighboring node and is encoded into the training objective. Consequently, our method mitigates class collision from negative and noisy positive samples while enhancing intra-class compactness. Extensive experiments are conducted on five benchmark datasets and three downstream tasks: node classification, node clustering, and node similarity search. The results demonstrate that our method generates node representations with enhanced intra-class compactness and achieves state-of-the-art performance.
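To make the idea of a supportiveness-weighted objective concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the supportiveness of each neighbor is a softmax, taken over an anchor's neighbors, of temperature-scaled cosine similarity, and that the loss is the negative sum of the node-itself alignment and the weighted node-neighbor alignment. The function names, the `alpha` trade-off weight, and the exact attention form are hypothetical; the abstract specifies none of them.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def supportiveness(h_anchor, h_neighbor, edges, tau=1.0):
    """Assumed attention form: per-anchor softmax of cosine similarity / tau.

    edges is an (E, 2) integer array of (anchor, neighbor) index pairs.
    Returns one weight per edge; the weights of each anchor's edges sum to 1.
    """
    src, dst = edges[:, 0], edges[:, 1]
    q, k = l2_normalize(h_anchor), l2_normalize(h_neighbor)
    scores = np.sum(q[src] * k[dst], axis=1) / tau
    n = h_anchor.shape[0]
    # Subtract the per-anchor maximum for a numerically stable softmax.
    max_per_anchor = np.full(n, -np.inf)
    np.maximum.at(max_per_anchor, src, scores)
    exp = np.exp(scores - max_per_anchor[src])
    denom = np.zeros(n)
    np.add.at(denom, src, exp)
    return exp / denom[src]


def weighted_alignment_loss(z1, h2, edges, weights, alpha=1.0):
    """Negative (node-itself alignment + alpha * weighted node-neighbor alignment).

    z1: online predictions, h2: target representations (treated as constants,
    since a real implementation would stop gradients through the target).
    """
    z, h = l2_normalize(z1), l2_normalize(h2)
    self_sim = np.sum(z * h, axis=1).mean()
    src, dst = edges[:, 0], edges[:, 1]
    per_anchor = np.zeros(z1.shape[0])
    np.add.at(per_anchor, src, weights * np.sum(z[src] * h[dst], axis=1))
    return -(self_sim + alpha * per_anchor.mean())
```

Under these assumptions, a neighbor whose representation disagrees with the anchor receives a smaller weight and therefore contributes less to the alignment term, which is the mechanism the abstract credits with reducing class collision from noisy positives.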

Paper Structure

This paper contains 22 sections, 9 equations, 5 figures, 6 tables, and 1 algorithm.

Figures (5)

  • Figure 1: Overview of our proposed BLNN method. Given a graph, we first generate two different views using augmentations $t^1,t^2$. From these, we use encoders $f_{\theta}, f_\phi$ to form online and target node representations $\boldsymbol{H}^1, \boldsymbol{H}^2$. They are then fed into the attention module to compute the supportiveness $w_j$ of the neighbor $v_j$ w.r.t. the anchor node $v_i$. The predictor $p_\theta$ uses $\boldsymbol{H}^1$ to form a prediction $\boldsymbol{Z}^1$ of the target $\boldsymbol{H}^2$. The final objective combines the alignment of node-itself pairs with the supportiveness-weighted alignment of node-neighbor pairs. Note that the alignment is achieved by maximizing the cosine similarity between corresponding rows of $\boldsymbol{Z}^1$ and $\boldsymbol{H}^2$, with gradients flowing only through $\boldsymbol{Z}^1$. The target parameters $\phi$ are updated as an exponential moving average of $\theta$.
  • Figure 2: Empirical studies on WikiCS, Computer and CS. "noisy pos" indicates the raw node-neighbor pairs in the input graph, while "clean pos" indicates node-neighbor pairs filtered so that every pair is intra-class.
  • Figure 3: Case study to verify the efficacy of our attention module.
  • Figure 4: Visualization of the impact of $\tau$ on node classification.
  • Figure 5: t-SNE visualization and intra-class compactness of node representations on Computer. '$(*)$' indicates the mean intra-class pair-wise cosine similarity.
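
The intra-class compactness reported in Figure 5 is described as the mean intra-class pair-wise cosine similarity. A minimal NumPy sketch of that metric follows. It assumes the mean is taken over all ordered pairs of distinct nodes sharing a label, pooled across classes; whether the paper pools pairs this way or averages per class first is not stated in the caption. The function name is hypothetical.

```python
import numpy as np


def intra_class_compactness(h, labels):
    """Mean cosine similarity over all pairs of distinct nodes with the same label.

    h: (N, d) node representations; labels: (N,) integer class labels.
    Builds a dense N x N similarity matrix, so it suits small graphs only.
    Returns NaN if no two nodes share a label.
    """
    h = h / np.maximum(np.linalg.norm(h, axis=1, keepdims=True), 1e-12)
    sims = h @ h.T
    same_class = labels[:, None] == labels[None, :]
    np.fill_diagonal(same_class, False)  # exclude each node's pair with itself
    return float(sims[same_class].mean())
```

A value close to 1 means that nodes of the same class point in nearly the same direction in representation space, while a value near 0 means they are roughly orthogonal on average.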