Instance Optimality in PageRank Centrality Estimation

Mikkel Thorup, Hanzhi Wang

TL;DR

The paper examines the problem of estimating a vertex's PageRank centrality $\pi(t)$ in directed graphs using a simple, adaptive bidirectional approach. It defines instance-optimality for this problem and proves that an adaptive Bidirectional-PPR variant is instance-optimal on a broad class of graphs, including all sparse graphs, with a matching polylogarithmic lower-bound barrier obtained via a graph-augmentation construction. It also identifies a natural limitation: on graphs whose degrees are mostly equal to $n$, the algorithm is not instance-optimal, and the paper proposes an instance-smart strategy that detects such graphs and runs in polylogarithmic time. Overall, the work advances sublinear PageRank estimation by showing that a practical, near-optimal algorithm is effective for most graphs encountered in practice, and it draws a boundary between sparse or typical graphs and dense, highly regular graphs.

Abstract

We study an adaptive variant of a simple, classic algorithm for estimating a vertex's PageRank centrality within a constant relative error, with constant probability. We show that this algorithm is instance-optimal up to a polylogarithmic factor for any directed graph of order $n$ whose maximal in- and out-degrees are at most a constant fraction of $n$. The instance-optimality also extends to graphs in which up to a polylogarithmic number of vertices have unbounded degree, thereby covering all sparse graphs with $\widetilde{O}(n)$ edges. Finally, we provide a counterexample showing that the algorithm is not instance-optimal for graphs with degrees mostly equal to $n$.
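As a rough illustration of the classic bidirectional idea that the abstract refers to, the sketch below combines a backward push from the target $t$ with random walks from uniformly random start vertices. It is not the paper's adaptive variant: the fixed push threshold `r_max`, the teleport probability `alpha`, the number of walks, and all function names are assumptions made for this example, and the graph is assumed to have no vertex with out-degree zero. A power-iteration routine is included only as a reference to check the estimate against.

```python
import random
from collections import defaultdict


def backward_push(in_nbrs, out_deg, t, alpha, r_max):
    """Push residual mass backward from t until every residual is <= r_max.

    Maintains, for every source s, the invariant
        ppr_s(t) = reserve[s] + sum_v ppr_s(v) * residual[v].
    """
    reserve = defaultdict(float)
    residual = defaultdict(float)
    residual[t] = 1.0
    stack = [t]
    while stack:
        v = stack.pop()
        rv = residual[v]
        if rv <= r_max:
            continue  # stale entry, nothing left to push
        residual[v] = 0.0
        reserve[v] += alpha * rv
        for u in in_nbrs.get(v, ()):
            residual[u] += (1.0 - alpha) * rv / out_deg[u]
            if residual[u] > r_max:
                stack.append(u)
    return reserve, residual


def walk_endpoint(out_nbrs, start, alpha, rng):
    """Run a walk that stops with probability alpha at each step."""
    v = start
    while rng.random() >= alpha:
        v = rng.choice(out_nbrs[v])
    return v


def estimate_pagerank(out_nbrs, t, alpha=0.2, r_max=1e-3,
                      num_walks=20000, seed=0):
    """Bidirectional estimate of pi(t): exact reserve part plus a
    Monte Carlo estimate of the residual part."""
    nodes = list(out_nbrs)
    n = len(nodes)
    in_nbrs = defaultdict(list)
    out_deg = {}
    for u, nbrs in out_nbrs.items():
        out_deg[u] = len(nbrs)  # assumed >= 1 for every vertex
        for v in nbrs:
            in_nbrs[v].append(u)
    reserve, residual = backward_push(in_nbrs, out_deg, t, alpha, r_max)
    rng = random.Random(seed)
    mc = 0.0
    for _ in range(num_walks):
        s = rng.choice(nodes)  # uniform random start vertex
        mc += residual.get(walk_endpoint(out_nbrs, s, alpha, rng), 0.0)
    return sum(reserve.values()) / n + mc / num_walks


def pagerank_exact(out_nbrs, alpha=0.2, iters=200):
    """Reference PageRank by power iteration (linear time per iteration)."""
    nodes = list(out_nbrs)
    n = len(nodes)
    pi = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: alpha / n for v in nodes}
        for u in nodes:
            share = (1.0 - alpha) * pi[u] / len(out_nbrs[u])
            for v in out_nbrs[u]:
                nxt[v] += share
        pi = nxt
    return pi


if __name__ == "__main__":
    # Hypothetical toy graph: adjacency lists of out-neighbors.
    G = {0: [1, 2], 1: [2], 2: [0], 3: [2, 0], 4: [3, 0]}
    print("estimate:", estimate_pagerank(G, t=2))
    print("exact:   ", pagerank_exact(G)[2])
```

The threshold `r_max` trades push work against walk work: a smaller value makes the backward push more expensive but shrinks the residuals that the random walks have to average, so fewer walks are needed for the same accuracy. Choosing that balance per instance, rather than fixing it in advance as done here, is the kind of adaptivity the abstract alludes to.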

Paper Structure

This paper contains 32 sections, 21 theorems, 85 equations, 1 figure, 6 algorithms.

Key Result

Theorem 2.1

Consider any directed graph $G$ with maximal in- and out-degrees upper bounded by $(1-\varepsilon)n$ for some $\varepsilon\in[0,1]$. For any $r\in [0,1]$, suppose there exists an algorithm $A$ that estimates $\pi(t)$ in expected time $O\left(\min\left(T_{r}, r/\pi(t)\right) \varepsilon^2/\log^{3/2} n\right)$ ... where the probability $\Pr_R$ is taken over the choice of the random seed $R$ used by the algorithm.

Figures (1)

  • Figure 1: Sketch of the constructions of the graphs $G^-$ and $G^+$

Theorems & Definitions (39)

  • Theorem 2.1
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4
  • Lemma 2.5
  • proof
  • Lemma 2.6
  • Lemma 2.7
  • Lemma 2.8
  • Lemma 2.9
  • ...and 29 more