Graph Self-Supervised Learning with Learnable Structural and Positional Encodings

Asiri Wijesinghe, Hao Zhu, Piotr Koniusz

TL;DR

Graph Self-Supervised Learning (GSSL) often fails to capture global topology due to limitations in conventional GNN expressiveness and SSL's focus on final graph representations. The paper introduces GenHopNet, a $k$-hop message-passing GNN, and StructPosGSSL, a topology-aware SSL framework that jointly leverages structural and Laplacian-based positional encodings to learn topology-sensitive representations. The authors prove GenHopNet surpasses the $1$-WL test in expressiveness and demonstrate that StructPosGSSL effectively distinguishes non-isomorphic graphs with similar local patterns while remaining computationally scalable. Empirically, StructPosGSSL achieves state-of-the-art performance across small, large, and synthetic graph benchmarks, with extensive ablations highlighting the complementary roles of closed-walks, structural and positional encodings, and the NT-Xent/VICReg loss combination, underscoring its practical impact on topology-aware graph learning.

Abstract

Traditional Graph Self-Supervised Learning (GSSL) struggles to capture complex structural properties well. This limitation stems from two main factors: (1) the inadequacy of conventional Graph Neural Networks (GNNs) in representing sophisticated topological features, and (2) the focus of self-supervised learning solely on final graph representations. To address these issues, we introduce \emph{GenHopNet}, a GNN framework that integrates a $k$-hop message-passing scheme, enhancing its ability to capture local structural information without explicit substructure extraction. We theoretically demonstrate that \emph{GenHopNet} surpasses the expressiveness of the classical Weisfeiler-Lehman (WL) test for graph isomorphism. Furthermore, we propose a structural- and positional-aware GSSL framework that incorporates topological information throughout the learning process. This approach enables the learning of representations that are both sensitive to graph topology and invariant to specific structural and feature augmentations. Comprehensive experiments on graph classification datasets, including those designed to test structural sensitivity, show that our method consistently outperforms the existing approaches and maintains computational efficiency. Our work significantly advances GSSL's capability in distinguishing graphs with similar local structures but different global topologies.

Paper Structure

This paper contains 24 sections, 6 theorems, 6 equations, 5 figures, 12 tables.

Key Result

Theorem 1

The following statement is true: (a) If $\sum_{k} \mathbf{A}^k_{vv} \simeq_{Cycle} \sum_{k} \mathbf{A}^k_{v'v'}$, then $\sum_{k} \mathbf{A}^k_{vv} \simeq_{ClosedWalk} \sum_{k} \mathbf{A}^k_{v'v'}$; but not vice versa.
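The diagonal entry $\mathbf{A}^k_{vv}$ counts the closed walks of length $k$ that start and end at node $v$. The sketch below is a minimal illustration of that fact, not the authors' implementation; the function name `closed_walk_counts` and the two toy graphs are assumptions made for this example. It shows why closed-walk counts are not the same as cycle counts: a path graph has no cycles, yet it has closed walks of even length because a walk may retrace its edges.

```python
import numpy as np

def closed_walk_counts(adj, max_k):
    """Return an (n, max_k - 1) array whose column k - 2 holds diag(A^k), k = 2..max_k.

    Hypothetical helper for illustration only.
    """
    adj = np.asarray(adj, dtype=np.int64)
    n = adj.shape[0]
    counts = np.zeros((n, max_k - 1), dtype=np.int64)
    power = adj.copy()  # holds A^1 before the loop
    for k in range(2, max_k + 1):
        power = power @ adj  # now holds A^k
        counts[:, k - 2] = np.diag(power)  # closed walks of length k per node
    return counts

# Toy graphs (assumed for this example): a triangle and a 3-node path.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])

tri_counts = closed_walk_counts(triangle, 4)
path_counts = closed_walk_counts(path, 4)
print(tri_counts)   # columns correspond to k = 2, 3, 4
print(path_counts)
```

In the path graph the $k=3$ column is zero because there is no triangle, while the $k=2$ and $k=4$ columns are non-zero even though the graph is acyclic.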

Figures (5)

  • Figure 1: (a) A high-level overview of the architecture of StructPosGSSL ($G$ is an input graph with two views $(\mathbf{A},\mathbf{X}')$ and $(\mathbf{A}',\mathbf{X})$). Our design comprises three main components: (i) a structural encoder (Struct. Enc.) that generates structural embeddings ($\mathbf{h}_{SE}$ and $\mathbf{h}_{SE}'$) for nodes based on their local structural properties; (ii) a positional encoder (Pos. Enc.) that generates positional embeddings ($\mathbf{h}_{PE}$ and $\mathbf{h}_{PE}'$) for nodes; and (iii) a node aggregation layer, $\sum$, acting over all node embeddings to generate a global graph representation. Moreover, $\text{MLP}_{\phi}$ and $\text{MLP}_{\theta}$ are two projection heads for node representations and graph representations, whereas $\oplus$ is concatenation. (b) The real-world graph structures of two molecules, Decalin and Bicyclopentyl. While standard Graph SSL frameworks cannot distinguish between these molecular structures, our model can differentiate them.
  • Figure 2: A high-level overview of $k=2, \dots, 4$ closed walks, where the blue node is the source node. For $k=2$, the walk can traverse between nodes multiple times, forming a non-cyclic path. For $k=3$ and $k=4$, the walks can capture closed cycles of length 3 and 4, respectively, effectively identifying cyclic structures in the graph.
  • Figure 3: A pair of non-isomorphic graphs where positional encoding (with a dimension of 2) and EB attributes outperform the 1-WL test in distinguishing between graphs $G_1$ and $G_2$, allowing for the detection of structural differences that the 1-WL test fails to capture.
  • Figure 4: A pair of non-isomorphic graphs where the colored square box on each node represents the feature representations derived from closed-walk information. The two middle nodes (colored gray) in graphs $G_1$ and $G_2$ cannot be distinguished using only closed-walk information (up to $k=3$), as they receive the same representation (colored blue). However, by incorporating positional information along with EB attributes, we can successfully distinguish these nodes.
  • Figure 5: Accuracy (%) of StructPosGSSL-SA under different $\alpha$ values.
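Figures 3 and 4 refer to a positional encoding of dimension 2. A common, non-learnable baseline for Laplacian-based positional encodings takes the eigenvectors of the symmetric normalized Laplacian that belong to the smallest non-trivial eigenvalues. The sketch below shows that baseline only; it is an assumption for illustration and not the learnable encoder used in StructPosGSSL. The function name `laplacian_positional_encoding` and the 6-cycle test graph are hypothetical.

```python
import numpy as np

def laplacian_positional_encoding(adj, dim):
    """Return (eigenvalues, eigenvectors) for the `dim` smallest non-trivial
    eigenpairs of L = I - D^{-1/2} A D^{-1/2}.

    Hypothetical helper; assumes an undirected graph with a symmetric adjacency matrix.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep a scale of 0
    lap = np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    # Skip index 0 (the trivial eigenvalue 0 of a connected graph).
    return eigvals[1:dim + 1], eigvecs[:, 1:dim + 1]

# Toy graph (assumed for this example): a cycle on 6 nodes.
n = 6
cycle = np.zeros((n, n))
for i in range(n):
    cycle[i, (i + 1) % n] = 1.0
    cycle[(i + 1) % n, i] = 1.0

pe_vals, pe = laplacian_positional_encoding(cycle, dim=2)
print(pe.shape)  # one 2-dimensional positional vector per node
```

Eigenvectors are defined only up to sign (and up to rotation within a repeated eigenvalue), so pipelines that use such encodings typically randomize signs during training or use an invariant parameterization.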

Theorems & Definitions (14)

  • Definition 1
  • Definition 2
  • Theorem 1
  • Theorem 2
  • Lemma 1
  • Lemma 2
  • Corollary 1
  • Theorem 3
  • proof
  • proof
  • ...and 4 more