Rethinking Link Prediction for Directed Graphs
Mingguo He, Yuhe Guo, Yanping Zheng, Zhewei Wei, Stephan Günnemann, Xiaokui Xiao
TL;DR
This paper formalizes directed link prediction within a unified encoder–decoder framework and demonstrates that dual embeddings (a source embedding ${\mathbf{s}}_u$ and a target embedding ${\mathbf{t}}_u$ for each node $u$) capture directionality more effectively than a single embedding per node. It introduces DirLinkBench to standardize evaluation across seven real-world datasets and multiple metrics, revealing that simple models like DiGAE can outperform more complex approaches when decoders and losses are chosen appropriately. Building on these insights, the paper reinterprets DiGAE as a GCN on an undirected bipartite graph and presents SDGAE, a spectral directed graph auto-encoder that learns arbitrary polynomial filters with complexity $O(2K m d)$ and achieves state-of-the-art average performance on DirLinkBench. The work also analyzes the impact of feature inputs, loss functions, decoder design, degree distributions, and negative sampling, and outlines open challenges, including developing more expressive decoders for complex-valued methods and better preserving asymmetry in directed graphs.
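To illustrate why dual embeddings help with directionality, the following minimal sketch compares two decoders on random embeddings. It assumes a plain inner-product decoder, which is only one of several decoder choices; the variable names (`h`, `s`, `t`) and sizes are illustrative and not taken from the paper's code. With a single embedding per node, the score for $u \to v$ always equals the score for $v \to u$, whereas the source/target product ${\mathbf{s}}_u^\top {\mathbf{t}}_v$ can differ between the two directions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 3  # illustrative sizes: 5 nodes, 3-dimensional embeddings

# Single embedding per node, inner-product decoder (assumed for illustration):
# score(u -> v) = h_u . h_v, which is symmetric in u and v by construction.
h = rng.normal(size=(n, d))
single_scores = h @ h.T

# Dual embeddings: a source vector s_u and a target vector t_u per node.
# score(u -> v) = s_u . t_v, which generally differs from score(v -> u).
s = rng.normal(size=(n, d))
t = rng.normal(size=(n, d))
dual_scores = s @ t.T

single_is_symmetric = np.allclose(single_scores, single_scores.T)
dual_is_symmetric = np.allclose(dual_scores, dual_scores.T)
print(single_is_symmetric, dual_is_symmetric)
```

A symmetric score matrix cannot distinguish an edge $u \to v$ from its reverse, which is the limitation the dual-embedding formulation avoids.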
Abstract
Link prediction for directed graphs is a crucial task with diverse real-world applications. Recent advances in embedding methods and Graph Neural Networks (GNNs) have shown promising improvements. However, these methods often lack a thorough analysis of their expressiveness and lack effective benchmarks for fair evaluation. In this paper, we propose a unified framework to assess the expressiveness of existing methods, highlighting the impact of dual embeddings and decoder design on directed link prediction performance. To address limitations in current benchmark setups, we introduce DirLinkBench, a robust new benchmark with comprehensive coverage, standardized evaluation, and modular extensibility. The results on DirLinkBench show that current methods struggle to achieve strong performance, while DiGAE outperforms other baselines overall. We further revisit DiGAE theoretically, showing that its graph convolution aligns with GCN on an undirected bipartite graph. Inspired by these insights, we propose a novel Spectral Directed Graph Auto-Encoder (SDGAE) that achieves state-of-the-art average performance on DirLinkBench. Finally, we analyze key factors influencing directed link prediction and highlight open challenges in this field.
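The bipartite view mentioned above can be checked numerically on a toy graph. The sketch below is an illustration under stated assumptions rather than the paper's implementation: it assumes self-loops are added, uses symmetric degree normalization with exponent $1/2$ on both sides, and omits weight matrices and nonlinearities so that only the propagation step is compared. Each node is split into a source copy and a target copy, and a directed edge $u \to v$ becomes an undirected edge between the source copy of $u$ and the target copy of $v$.

```python
import numpy as np

# Toy directed graph (hypothetical): A[u, v] = 1 means an edge u -> v.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)
n = A.shape[0]
A_tilde = A + np.eye(n)  # assumption: self-loops added, as in GCN

# Directed propagation matrix: D_out^{-1/2} A_tilde D_in^{-1/2}.
d_out = A_tilde.sum(axis=1)
d_in = A_tilde.sum(axis=0)
A_hat = A_tilde / np.sqrt(np.outer(d_out, d_in))

# Undirected bipartite graph on 2n nodes: source copies first, target copies second.
Z = np.zeros((n, n))
B = np.block([[Z, A_tilde], [A_tilde.T, Z]])
deg = B.sum(axis=1)
B_hat = B / np.sqrt(np.outer(deg, deg))  # standard GCN normalization

rng = np.random.default_rng(0)
S = rng.normal(size=(n, 2))  # source embeddings
T = rng.normal(size=(n, 2))  # target embeddings

# One GCN propagation step on the bipartite graph (no weights, no nonlinearity).
out = B_hat @ np.vstack([S, T])
new_S, new_T = out[:n], out[n:]

# Source embeddings aggregate target embeddings along outgoing edges, and vice versa.
match_S = np.allclose(new_S, A_hat @ T)
match_T = np.allclose(new_T, A_hat.T @ S)
print(match_S, match_T)
```

Under these assumptions, a single symmetric convolution on the bipartite graph reproduces the two coupled directed updates, one for source embeddings and one for target embeddings.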
