Landmark-Based Node Representations for Shortest Path Distance Approximations in Random Graphs
My Le, Luana Ruiz, Souvik Dhara
TL;DR
This work studies landmark-based node embeddings aimed at preserving shortest-path distances, inspired by Bourgain's metric embeddings. It shows that on Erdős–Rényi random graphs, the embedding dimension required for low-distortion distance approximations can be significantly smaller than worst-case bounds suggest, with concrete rates depending on tunable parameters. A GNN-augmented variant is proposed to learn landmark distances, reducing explicit path computations and enabling transfer to larger graphs and real networks, with empirical evidence that GNN-based bounds can outperform exact landmark methods. Together, the results deliver both average-case theoretical insights and practical, scalable methods for distance-aware graph representations on large-scale networks.
Abstract
Learning node representations is a fundamental problem in graph machine learning. While existing embedding methods effectively preserve local similarity measures, they often fail to capture global functions like graph distances. Inspired by Bourgain's seminal work on Hilbert space embeddings of metric spaces (1985), we study the performance of local distance-preserving node embeddings. Known as landmark-based algorithms, these embeddings approximate pairwise distances by computing shortest paths from a small subset of reference nodes called landmarks. Our main theoretical contribution shows that random graphs, such as Erdős–Rényi random graphs, admit lower-dimensional landmark-based embeddings than worst-case graphs. Empirically, we demonstrate that GNN-based approximations of the distances to landmarks generalize well to larger real-world networks, offering a scalable and transferable alternative for graph representation learning.
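
To make the landmark idea concrete, the following is a minimal sketch of a generic landmark scheme on an unweighted graph, not the authors' implementation. Each node is embedded as its vector of hop distances to the landmarks, and the triangle inequality then gives a lower and an upper bound on any pairwise distance. The function names (bfs_distances, landmark_embedding, distance_bounds), the adjacency-list input format, and the toy path graph are assumptions made for illustration. The landmarks are passed in as given, so how they are selected is left open.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def landmark_embedding(adj, landmarks):
    """Embed each node as its list of distances to the landmarks.

    Unreachable landmarks are recorded as infinity.
    """
    inf = float("inf")
    per_landmark = [bfs_distances(adj, lm) for lm in landmarks]
    return {u: [d.get(u, inf) for d in per_landmark] for u in adj}


def distance_bounds(emb, u, v):
    """Triangle-inequality bounds (lower, upper) on d(u, v).

    lower = max over landmarks of |d(u, l) - d(v, l)|
    upper = min over landmarks of  d(u, l) + d(v, l)
    Landmarks unreachable from either node are skipped.
    """
    inf = float("inf")
    pairs = [(a, b) for a, b in zip(emb[u], emb[v]) if a != inf and b != inf]
    if not pairs:
        return 0, inf
    lower = max(abs(a - b) for a, b in pairs)
    upper = min(a + b for a, b in pairs)
    return lower, upper


# Toy example (hypothetical): the path graph 0 - 1 - 2 - 3 - 4 with landmarks 0 and 2.
path_adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
emb = landmark_embedding(path_adj, landmarks=[0, 2])
print(emb[1], emb[4])              # [1, 1] [4, 2]
print(distance_bounds(emb, 1, 4))  # (3, 3)
```

In this toy case the two bounds coincide with the true distance because landmark 2 lies on the shortest path between nodes 1 and 4. In general they only bracket the distance, and the embedding dimension equals the number of landmarks.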
