Graph-level Representation Learning with Joint-Embedding Predictive Architectures
Geri Skenderi, Hang Li, Jiliang Tang, Marco Cristani
TL;DR
Graph-JEPA introduces a graph-level Joint-Embedding Predictive Architecture that learns semantic representations by predicting the latent embeddings of masked subgraphs from a context subgraph, operating entirely in latent space and without negative samples. The method partitions each graph into subgraphs, encodes them with GNNs, and uses a simple predictor to estimate the coordinates of each encoded target subgraph on the unit hyperbola in the 2D plane, which encourages a hierarchical latent structure. Empirical results on diverse graph classification and regression tasks show competitive or state-of-the-art performance with favorable training efficiency, and ablations demonstrate the benefits of hyperbolic latent prediction, RWSE positional embeddings, and structured partitioning. The work highlights the practicality and effectiveness of latent self-predictive SSL for graphs and points to future extensions to node- and edge-level tasks and to a theoretical analysis of the latent geometry.
Abstract
Joint-Embedding Predictive Architectures (JEPAs) have recently emerged as a novel and powerful technique for self-supervised representation learning. They aim to learn an energy-based model by predicting the latent representation of a target signal y from the latent representation of a context signal x. JEPAs bypass the need for the negative and positive samples traditionally required by contrastive learning, while avoiding the overfitting issues associated with generative pretraining. In this paper, we show that graph-level representations can be effectively modeled using this paradigm by proposing a Graph Joint-Embedding Predictive Architecture (Graph-JEPA). In particular, we employ masked modeling and focus on predicting the latent representations of masked subgraphs starting from the latent representation of a context subgraph. To endow the representations with the implicit hierarchy that is often present in graph-level concepts, we devise an alternative prediction objective that consists of predicting the coordinates of the encoded subgraphs on the unit hyperbola in the 2D plane. Through multiple experimental evaluations, we show that Graph-JEPA can learn highly semantic and expressive representations, as evidenced by its downstream performance in graph classification, graph regression, and distinguishing non-isomorphic graphs. The code is available at https://github.com/geriskenderi/graph-jepa.
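The hyperbola-based objective can be illustrated with a minimal sketch. The abstract does not say how a latent vector is reduced to a point on the unit hyperbola or which distance is minimized, so the choices below are assumptions made only for illustration: the hyperbolic angle is taken as the mean of the latent entries, the point is (cosh α, sinh α), and the loss is a smooth L1 distance. The function names `hyperbola_point` and `smooth_l1` are hypothetical and are not taken from the released code.

```python
import math


def hyperbola_point(latent):
    """Map a latent vector to a point on the unit hyperbola x^2 - y^2 = 1.

    Assumption (not stated in the abstract): the hyperbolic angle alpha
    is the mean of the latent entries.
    """
    alpha = sum(latent) / len(latent)
    return (math.cosh(alpha), math.sinh(alpha))


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 distance averaged over coordinates (assumed loss choice)."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta  # quadratic near zero
        else:
            total += diff - 0.5 * beta         # linear for large errors
    return total / len(pred)


# Stand-ins for the target-encoder output and the predictor output
# for one masked subgraph (values are arbitrary).
target_latent = [0.2, -0.1, 0.5, 0.0]
predicted_latent = [0.3, 0.0, 0.4, 0.1]

target_xy = hyperbola_point(target_latent)
pred_xy = hyperbola_point(predicted_latent)
loss = smooth_l1(pred_xy, target_xy)

# Both points satisfy x^2 - y^2 = 1 up to floating-point error.
print(target_xy, pred_xy, loss)
```

Under these assumptions the prediction target is a 2D point rather than the full latent vector, and both the prediction and the target lie on the hyperbola by construction.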
