Edge-Parallel Graph Encoder Embedding
Ariel Lubonja, Cencheng Shen, Carey Priebe, Randal Burns
TL;DR
This work targets efficient graph embedding by accelerating One-Hot Graph Encoder Embedding (GEE) through edge-parallelism in the Ligra graph engine. The core idea is to reformulate GEE as an edge-map program with a frontier covering all nodes and to use lock-free atomic updates to resolve potential write conflicts, achieving substantial speedups over prior serial and JIT-compiled implementations. Empirical results on large-scale graphs demonstrate up to 500x speedup versus the original implementation and 17x versus Numba, enabling embedding of graphs with up to 1.8B edges in around 6.5 seconds and showing strong scalability on multi-core platforms. The approach combines algorithmic reformulation, parallel runtime engineering, and memory-aware execution to make practical, high-quality graph embeddings feasible at billion-edge scales.
Abstract
New algorithms for embedding graphs have reduced the asymptotic complexity of finding low-dimensional representations. One-Hot Graph Encoder Embedding (GEE) uses a single, linear pass over edges and produces an embedding that converges asymptotically to the spectral embedding. The scaling and performance benefits of this approach have been limited by a serial implementation in an interpreted language. We refactor GEE into a parallel program in the Ligra graph engine that maps functions over the edges of the graph and uses lock-free atomic instructions to prevent data races. On a graph with 1.8B edges, this results in a 500 times speedup over the original implementation and a 17 times speedup over a just-in-time compiled version.
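
To make the "single, linear pass over edges" concrete, the following is a minimal serial sketch of a one-hot encoder embedding in Python. It is an illustration under stated assumptions, not the authors' Ligra code: the function name `gee_embed`, the `(u, v, weight)` edge-list layout, and the convention that a label of -1 marks an unlabeled node are all hypothetical choices made for this example. Each edge contributes to the embedding row of each endpoint, in the column of the other endpoint's class, scaled by one over that class's size. In an edge-parallel version, the two `+=` updates in the loop are the writes that would need to be atomic, because several edges can touch the same row at once.

```python
import numpy as np


def gee_embed(edges, labels, num_classes):
    """Serial sketch of a one-hot graph encoder embedding.

    edges:       iterable of (u, v, weight), each undirected edge listed once
    labels:      integer class label per node, -1 for unlabeled (assumption)
    num_classes: number of classes K; the embedding has K columns
    """
    labels = np.asarray(labels)
    n = labels.shape[0]

    # Class sizes n_k, counted over labeled nodes only.
    known = labels >= 0
    counts = np.bincount(labels[known], minlength=num_classes).astype(float)

    # Per-node projection weight 1 / n_k; unlabeled nodes contribute nothing.
    node_w = np.zeros(n)
    node_w[known] = 1.0 / counts[labels[known]]

    Z = np.zeros((n, num_classes))
    for u, v, w in edges:
        # In a parallel edge-map these two updates would be atomic adds,
        # since different edges may write to the same entry of Z.
        if labels[v] >= 0:
            Z[u, labels[v]] += w * node_w[v]
        if labels[u] >= 0:
            Z[v, labels[u]] += w * node_w[u]
    return Z


# Toy usage: a path graph 0-1-2-3 with two classes of two nodes each.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
labels = [0, 0, 1, 1]
Z = gee_embed(edges, labels, num_classes=2)
```

The loop does constant work per edge, so the cost is linear in the number of edges once the class sizes are known. The embedding matrix has one row per node and one column per class, which is what keeps the writes small enough to handle with atomic updates rather than locks.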
