Self-Supervised Graph Representation Learning via Global Context Prediction

Zhen Peng, Yixiang Dong, Minnan Luo, Xiao-Ming Wu, Qinghua Zheng

TL;DR

This work tackles unsupervised graph representation learning by exploiting a free supervisory signal inherent to graph structure. It introduces S$^{2}$GRL, a self-supervised framework that learns node embeddings by predicting the relative contextual position of one node with respect to another, where position is derived from the hop count between the two nodes. The approach uses a graph encoder built from graph convolutional layers and a hop-based classification objective that partitions the global context into major hop categories. Empirical results on node classification, clustering, and link prediction show performance competitive with state-of-the-art unsupervised methods and, in some cases, with supervised models, highlighting the potential of hop-count supervision for scalable learning on unlabeled graphs.

Abstract

To take full advantage of fast-growing unlabeled networked data, this paper introduces a novel self-supervised strategy for graph representation learning by exploiting natural supervision provided by the data itself. Inspired by human social behavior, we assume that the global context of each node is composed of all nodes in the graph since two arbitrary entities in a connected network could interact with each other via paths of varying length. Based on this, we investigate whether the global context can be a source of free and effective supervisory signals for learning useful node representations. Specifically, we randomly select pairs of nodes in a graph and train a well-designed neural net to predict the contextual position of one node relative to the other. Our underlying hypothesis is that the representations learned from such within-graph context would capture the global topology of the graph and finely characterize the similarity and differentiation between nodes, which is conducive to various downstream learning tasks. Extensive benchmark experiments including node classification, clustering, and link prediction demonstrate that our approach outperforms many state-of-the-art unsupervised methods and sometimes even exceeds the performance of supervised counterparts.
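
The pair-sampling step described above can be illustrated with a small sketch. It is not the authors' implementation: the function names (`hop_counts`, `hop_class`, `sample_pairs`), the adjacency-dict graph format, and the choice of four classes with all distances of four or more hops merged into the last one are assumptions made for illustration. The sketch only shows how hop-count labels for random node pairs could be derived from the graph itself, without any manual annotation.

```python
import random
from collections import deque


def hop_counts(adj, source):
    """Breadth-first search: shortest-path hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hop_class(hops, max_class=4):
    """Map a hop count (>= 1) to a class index in 0..max_class-1.

    Assumption: distances of `max_class` hops or more share the last class.
    """
    return min(hops, max_class) - 1


def sample_pairs(adj, num_attempts, max_class=4, seed=0):
    """Draw random node pairs and label each with its hop class.

    Unreachable pairs are skipped, so fewer than `num_attempts` pairs
    may be returned on a disconnected graph.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    pairs = []
    for _ in range(num_attempts):
        u, v = rng.sample(nodes, 2)  # two distinct nodes
        dist = hop_counts(adj, u)
        if v in dist:
            pairs.append((u, v, hop_class(dist[v], max_class)))
    return pairs


# Hypothetical toy graph: a path 0 - 1 - 2 - 3 - 4 - 5.
toy_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
training_pairs = sample_pairs(toy_adj, num_attempts=10)
```

Each `(u, v, label)` triple would then serve as one training example for a classifier operating on the encoder's embeddings of `u` and `v`. Running a fresh BFS per pair keeps the sketch short; on a large graph one would more likely cache distances per source node or precompute them in batches.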

Paper Structure

This paper contains 20 sections, 8 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: A toy example of our self-supervised task involving predicting the relative contextual position of one node to another.
  • Figure 2: Results of an unsupervised baseline formed by stacking a varying number of graph convolutional layers, on node classification (left) and clustering (right) tasks. Performance initially improves as the number of layers increases, but adding further layers leads to over-smoothing and performance decay.
  • Figure 3: The proposed self-supervised framework S$^{2}$GRL for learning node representations over graph-structured data.
  • Figure 4: (a-b) t-SNE plots of node pairs w.r.t. topological distance on Cora. The color corresponds to the length of the shortest path between pairs of nodes. Vectors learned by S$^{2}$GRL present better structural properties. (c-e) Visualization of the learned representations on Cora.

Theorems & Definitions (1)

  • Definition