Instance Optimality in PageRank Centrality Estimation
Mikkel Thorup, Hanzhi Wang
TL;DR
The paper studies the problem of estimating a vertex's PageRank centrality $\pi(t)$ in directed graphs using a simple, adaptive bidirectional approach. It proves that an adaptive variant of Bidirectional-PPR is instance-optimal, up to a polylogarithmic factor, on a broad class of graphs that includes all sparse graphs, with the corresponding lower bound obtained through a graph-augmentation construction. It also identifies a natural limitation: on graphs whose degrees are mostly equal to $n$, the algorithm is not instance-optimal, and the paper proposes an instance-smart strategy that detects such graphs and runs in polylogarithmic time. Overall, the work advances sublinear PageRank estimation by showing that a simple algorithm is near-optimal on most graphs encountered in practice, and it delineates the boundary between sparse or typical graphs and dense, highly regular graphs.
Abstract
We study an adaptive variant of a simple, classic algorithm for estimating a vertex's PageRank centrality within a constant relative error, with constant probability. We show that this algorithm is instance-optimal up to a polylogarithmic factor for any directed graph of order $n$ whose maximal in- and out-degrees are at most a constant fraction of $n$. The instance-optimality also extends to graphs in which up to a polylogarithmic number of vertices have unbounded degree, thereby covering all sparse graphs with $\widetilde{O}(n)$ edges. Finally, we provide a counterexample showing that the algorithm is not instance-optimal for graphs with degrees mostly equal to $n$.
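For orientation, the following is a minimal sketch of the classic bidirectional scheme that this line of work builds on: a backward push from the target $t$ followed by forward random walks. It is not the paper's adaptive variant. The function name `estimate_pagerank`, the fixed residue threshold `r_max`, the teleport probability `alpha`, and the walk budget `num_walks` are all assumptions made for illustration, and the sketch assumes every vertex has at least one out-neighbor.

```python
import random
from collections import defaultdict, deque


def estimate_pagerank(out_nbrs, t, alpha=0.2, r_max=1e-3, num_walks=10_000, seed=0):
    """Estimate pi(t) on a directed graph given as {vertex: [out-neighbors]}.

    Assumptions (not taken from the paper): a fixed push threshold r_max,
    a fixed number of walks, and no dangling vertices.
    """
    rng = random.Random(seed)
    nodes = list(out_nbrs)
    n = len(nodes)

    # Build in-neighbor lists, which the backward push needs.
    in_nbrs = defaultdict(list)
    for u, vs in out_nbrs.items():
        for v in vs:
            in_nbrs[v].append(u)

    # Backward push from t. Invariant for every source s:
    #   ppr(s, t) = reserve[s] + sum_v ppr(s, v) * residue[v]
    reserve = defaultdict(float)
    residue = defaultdict(float)
    residue[t] = 1.0
    queue = deque([t])
    while queue:
        v = queue.popleft()
        r = residue[v]
        if r <= r_max:
            continue  # stale queue entry, or residue already small enough
        residue[v] = 0.0
        reserve[v] += alpha * r
        for u in in_nbrs[v]:
            residue[u] += (1 - alpha) * r / len(out_nbrs[u])
            if residue[u] > r_max:
                queue.append(u)

    # Forward phase: alpha-terminating walks from uniformly random sources.
    # A walk ends at v with probability pi(v), so the average residue at the
    # endpoint is an unbiased estimate of sum_v pi(v) * residue[v].
    walk_sum = 0.0
    for _ in range(num_walks):
        v = rng.choice(nodes)
        while rng.random() >= alpha:
            v = rng.choice(out_nbrs[v])
        walk_sum += residue[v]

    return sum(reserve.values()) / n + walk_sum / num_walks
```

Averaging the push invariant over a uniformly random source gives $\pi(t) = \frac{1}{n}\sum_s \mathrm{reserve}(s) + \sum_v \pi(v)\,\mathrm{residue}(v)$, which is why the returned value is an unbiased estimate. In this sketch `r_max` is a fixed input that trades push work against walk variance; an adaptive variant would instead choose that balance during execution.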
