Massively Parallel Algorithms for Approximate Shortest Paths
Michal Dory, Shaked Matar
TL;DR
The paper addresses approximate shortest-path computation in poly(log log n) rounds in the MPC model for unweighted undirected graphs by combining two core constructs: limited-scale hopsets for short distances and near-additive emulators for long distances. It delivers a near-linear-memory, randomized framework that achieves (1+ε)-approximate SSSP in poly(log log n) rounds and builds a distance oracle that answers (1+ε)(2k-1)-approximate distance queries in O(1) additional rounds. A unified general framework (sampling, edge selection) yields near-exact hopsets and emulators, with careful memory management enabling implementations in the sublinear MPC model and in heterogeneous MPC settings (one near-linear machine plus sublinear peers). The approach improves over prior algorithms for the near-linear MPC model, which either required Ω(log n) rounds or gave only an Ω(log n) approximation. It supports approximate distance queries between arbitrary pairs of vertices via a two-structure scheme (limited-scale distance sketches plus emulators) and offers flexible memory-speed tradeoffs via spanner-based optimizations. Overall, the work advances efficient, scalable distance computation in MPC, with implications for large-scale graph analytics in MapReduce-like environments.
Abstract
We present fast algorithms for approximate shortest paths in the massively parallel computation (MPC) model. We provide randomized algorithms that take $poly(\log{\log{n}})$ rounds in the near-linear memory MPC model. Our results are for unweighted undirected graphs with $n$ vertices and $m$ edges. Our first contribution is a $(1+ε)$-approximation algorithm for Single-Source Shortest Paths (SSSP) that takes $poly(\log{\log{n}})$ rounds in the near-linear MPC model, where the memory per machine is $\tilde{O}(n)$ and the total memory is $\tilde{O}(mn^ρ)$, where $ρ$ is a small constant. Our second contribution is a distance oracle that approximates the distance between any pair of vertices. The distance oracle is constructed in $poly(\log{\log{n}})$ rounds and answers a query for a $(1+ε)(2k-1)$-approximate distance between any pair of vertices $u$ and $v$ in $O(1)$ additional rounds. The algorithm is for the near-linear memory MPC model with total memory of size $\tilde{O}((m+n^{1+ρ})n^{1/k})$, where $ρ$ is a small constant. While our algorithms are for the near-linear MPC model, they in fact use only one machine with $\tilde{O}(n)$ memory; the remaining machines can have sublinear memory of size $O(n^γ)$ for a small constant $γ< 1$. All previous algorithms for approximate shortest paths in the near-linear MPC model either required $Ω(\log{n})$ rounds or had an $Ω(\log{n})$ approximation. Our approach is based on fast construction of near-additive emulators, limited-scale hopsets and limited-scale distance sketches that are tailored for the MPC model. While our end results are for the near-linear MPC model, many of the tools we construct, such as hopsets and emulators, are constructed in the more restricted sublinear MPC model.
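To make the role of a hopset concrete, the following is a minimal sketch and not the paper's construction. In the standard definition, a hopset is a set of weighted shortcut edges $H$ such that paths in $G \cup H$ using at most $β$ edges already approximate the true distances, so a few Bellman-Ford-style relaxation rounds suffice. The function name `hop_limited_distances`, the toy path graph, and the hand-picked shortcut edges are all assumptions made for illustration. The two shortcuts only serve the pair $(0, 7)$ and do not form a full hopset for the graph.

```python
import math

def hop_limited_distances(n, edges, source, max_hops):
    """Bellman-Ford restricted to paths with at most max_hops edges.

    edges is a list of undirected (u, v, weight) triples.
    """
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(max_hops):
        # Relax only from the previous round's values, so that after
        # i rounds dist[v] is the best length over paths of <= i edges.
        new_dist = dist[:]
        for u, v, w in edges:
            if dist[u] + w < new_dist[v]:
                new_dist[v] = dist[u] + w
            if dist[v] + w < new_dist[u]:
                new_dist[u] = dist[v] + w
        dist = new_dist
    return dist

# Toy input: the unweighted path 0 - 1 - ... - 7.
n = 8
graph_edges = [(i, i + 1, 1) for i in range(n - 1)]

# Hypothetical shortcut edges, each weighted by the true distance
# between its endpoints (chosen by hand, not by the paper's algorithm).
shortcuts = [(0, 4, 4), (4, 7, 3)]

without_shortcuts = hop_limited_distances(n, graph_edges, 0, max_hops=2)
with_shortcuts = hop_limited_distances(n, graph_edges + shortcuts, 0, max_hops=2)

print(without_shortcuts[7])  # vertex 7 is unreachable within 2 hops
print(with_shortcuts[7])     # the shortcuts give a 2-hop path of length 7
```

In this sketch, two relaxation rounds cannot reach vertex 7 in the original path, while the same two rounds recover its exact distance once the shortcuts are added. Each relaxation round maps naturally to a constant number of parallel rounds, which is why a small hop bound matters in round-limited models such as MPC.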
