Faster Graph Embeddings via Coarsening
Matthew Fahrbach, Gramoz Goranci, Richard Peng, Sushant Sachdeva, Chi Wang
TL;DR
This work addresses the cost of computing graph embeddings on large graphs when only a subset of terminal vertices is of interest. It introduces two vertex-sparsification algorithms based on Schur complements, SchurComplement and RandomContraction, which shrink the graph with near-linear-time preprocessing while preserving the structure relevant to the terminals: exactly for the true Schur complement graph, and in expectation for the randomized coarsening. The authors prove that embedding matrices, in particular the NetMF matrix, are preserved on the terminals as the walk length grows, and they report substantial speedups with competitive accuracy on multi-label vertex classification and link prediction benchmarks. As a result, high-quality embeddings for a subset of nodes become practical on large-scale networks without sacrificing predictive performance.
Abstract
Graph embeddings are a ubiquitous tool for machine learning tasks, such as node classification and link prediction, on graph-structured data. However, computing the embeddings for large-scale graphs is prohibitively inefficient even if we are interested only in a small subset of relevant vertices. To address this, we present an efficient graph coarsening approach, based on Schur complements, for computing the embedding of the relevant vertices. We prove that these embeddings are preserved exactly by the Schur complement graph that is obtained via Gaussian elimination on the non-relevant vertices. As computing Schur complements is expensive, we give a nearly-linear time algorithm that generates a coarsened graph on the relevant vertices that provably matches the Schur complement in expectation in each iteration. Our experiments involving prediction tasks on graphs demonstrate that computing embeddings on the coarsened graph, rather than the entire graph, leads to significant time savings without sacrificing accuracy.
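To make the exact (non-randomized) construction concrete, the following is a minimal sketch of the Schur complement of a graph Laplacian onto a set of terminal vertices, computed two ways: with the block formula S = L_TT - L_TF L_FF^{-1} L_FT, and by Gaussian elimination of the non-terminal vertices one at a time. It is a dense NumPy illustration and not the paper's nearly-linear-time algorithm; the function names (`laplacian`, `schur_complement`, `eliminate_one_by_one`) and the example graph are hypothetical, and the sketch assumes a connected graph so that every pivot is nonzero.

```python
import numpy as np

def laplacian(n, weighted_edges):
    """Dense Laplacian L = D - A of an undirected weighted graph on n vertices."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

def schur_complement(L, terminals):
    """Block formula S = L_TT - L_TF L_FF^{-1} L_FT, where F is the non-terminal set."""
    T = np.asarray(sorted(terminals))
    F = np.setdiff1d(np.arange(L.shape[0]), T)
    L_TT = L[np.ix_(T, T)]
    L_TF = L[np.ix_(T, F)]
    L_FF = L[np.ix_(F, F)]
    # L is symmetric, so L_FT equals the transpose of L_TF.
    return L_TT - L_TF @ np.linalg.solve(L_FF, L_TF.T)

def eliminate_one_by_one(L, terminals):
    """Same matrix, obtained by eliminating one non-terminal vertex per step."""
    keep = list(range(L.shape[0]))  # vertex ids still present, in increasing order
    S = L.copy()
    for v in sorted(set(keep) - set(terminals), reverse=True):
        idx = keep.index(v)
        others = [i for i in range(len(keep)) if i != idx]
        col = S[others, idx]
        # Rank-one update: removes v and adds a weighted clique on its neighbors.
        S = S[np.ix_(others, others)] - np.outer(col, col) / S[idx, idx]
        keep.pop(idx)
    return S  # rows and columns ordered by increasing terminal id

# Hypothetical example: a 5-cycle with one chord, terminals {0, 2}.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0), (4, 0, 1.0), (1, 3, 2.0)]
L = laplacian(5, edges)
S_block = schur_complement(L, [0, 2])
S_elim = eliminate_one_by_one(L, [0, 2])
```

Both routines return the same matrix, and that matrix is again a graph Laplacian (zero row sums, non-positive off-diagonal entries), which is why the result can be read as a smaller weighted graph on the terminals. The dense solve is far more expensive than a nearly-linear-time method, which is the cost that motivates the randomized coarsening described in the abstract.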
