Generalized Graph Transformer Variational Autoencoder
Siddhant Karki
TL;DR
The paper tackles link prediction on graphs by replacing traditional message passing with a Generalized Graph Transformer Variational Autoencoder (GGT-VAE) that uses Laplacian positional encodings and global self-attention to learn a probabilistic latent space. It demonstrates that transformer-based encoders can capture both local and global graph structure without neighborhood aggregation, achieving competitive ROC-AUC and AP on Planetoid datasets (Cora and Citeseer). Through qualitative attention maps and quantitative metrics like globality, the authors show how the model reasons over long-range relationships, with ablations confirming robust performance across reasonable hyperparameter ranges. The work highlights the potential of combining graph transformers with variational inference for scalable graph generation and link prediction, with future directions toward full graph generation, molecular design, and multimodal conditioning.
Abstract
Link prediction has long been a central problem in graph representation learning, with applications in both network analysis and generative modeling. Recent progress in deep learning has introduced increasingly sophisticated architectures for capturing relational dependencies in graph-structured data. In this work, we propose the Generalized Graph Transformer Variational Autoencoder (GGT-VAE), which integrates the Generalized Graph Transformer architecture with the Variational Autoencoder framework for link prediction. Unlike prior GraphVAE, GCN, or other GNN approaches, GGT-VAE uses transformer-style global self-attention together with Laplacian positional encodings to encode structural patterns across nodes into a latent space without relying on message passing. Experimental results on several benchmark datasets demonstrate that GGT-VAE consistently achieves performance above the baselines in terms of ROC-AUC and Average Precision. To the best of our knowledge, this is among the first studies to explore graph structure generation using a generalized graph transformer backbone in a variational framework.
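To make the pipeline described above concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a single attention head, randomly initialized (untrained) weights, and an inner-product decoder in the style of standard variational graph autoencoders; the abstract does not specify the decoder. All function names and parameters (`laplacian_positional_encoding`, `global_self_attention`, `encode_and_decode`, `k`, `hidden`, `latent`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def laplacian_positional_encoding(adj, k):
    """Return k eigenvectors of the symmetric normalized Laplacian.

    The eigenvector with the smallest eigenvalue is skipped because it is
    trivial for a connected graph. Eigenvector signs are arbitrary.
    """
    deg = adj.sum(axis=1)
    # Guard against isolated nodes when inverting the degree.
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(adj.shape[0]) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    return eigvecs[:, 1:k + 1]


def global_self_attention(h, w_q, w_k, w_v):
    """Single-head attention over ALL node pairs (no adjacency mask)."""
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return attn @ v, attn


def encode_and_decode(features, adj, k=2, hidden=8, latent=4, seed=0):
    """One forward pass: positional encoding, attention, sampling, decoding."""
    rng = np.random.default_rng(seed)
    pe = laplacian_positional_encoding(adj, k)
    h = np.concatenate([features, pe], axis=1)  # inject structure via PE only
    d_in = h.shape[1]
    # Assumption: random weights stand in for trained parameters.
    w_q, w_k, w_v = (rng.normal(scale=0.1, size=(d_in, hidden)) for _ in range(3))
    h, attn = global_self_attention(h, w_q, w_k, w_v)
    mu = h @ rng.normal(scale=0.1, size=(hidden, latent))
    logvar = h @ rng.normal(scale=0.1, size=(hidden, latent))
    # Reparameterization trick: z = mu + sigma * eps.
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # Assumption: inner-product decoder gives edge probabilities.
    edge_probs = 1.0 / (1.0 + np.exp(-(z @ z.T)))
    return edge_probs, attn, mu, logvar
```

In this sketch the adjacency matrix enters the encoder only through the positional encodings, which mirrors the claim that structure is captured without neighborhood aggregation. Training would add a reconstruction loss on the edge probabilities and a KL term on `mu` and `logvar`, both omitted here.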
