Near-Optimality for Single-Source Personalized PageRank

Xinpeng Jiang, Haoyu Liu, Siqiang Luo, Xiaokui Xiao

Abstract

The Single-Source Personalized PageRank (SSPPR) query is central to graph OLAP, measuring the probability $\pi(s,t)$ that an $\alpha$-decay random walk from node $s$ terminates at $t$. Despite decades of research, a significant gap remains between upper and lower bounds for its computational complexity. Existing upper bounds are $O\left(\min\left(\frac{\log(1/\epsilon)}{\epsilon^2}, \frac{\sqrt{m \log n}}{\epsilon}, m \log \frac{1}{\epsilon}\right)\right)$ for SSPPR-A and $O\left(\min\left(\frac{\log(1/n)}{\delta}, \sqrt{m \log(n/\delta)}, m \log \left(\frac{\log(n)}{m\delta}\right)\right)\right)$ for SSPPR-R, with trivial lower bounds of $\Omega(\min(n,1/\epsilon))$ and $\Omega(\min(n,1/\delta))$. This work narrows or closes this gap. We improve the upper bounds for SSPPR-A and SSPPR-R to $O\left(\frac{1}{\epsilon^2}\right)$ and $O\left(\min\left(\frac{\log(1/\delta)}{\delta}, m + n \log(n) \log \left(\frac{\log(n)}{m\delta}\right)\right)\right)$, respectively, offering improvements by factors of $\log(1/\epsilon)$ and $\log\left(\frac{\log(n)}{m\delta}\right)$. On the lower bound side, we establish stronger results: $\Omega(\min(m, 1/\epsilon^2))$ for SSPPR-A and $\Omega\left(\min\left(m, \frac{\log(1/\delta)}{\delta}\right)\right)$ for SSPPR-R, strengthening theoretical foundations. Our upper and lower bounds for SSPPR-R coincide for graphs with $m \in \Omega(n \log^2 n)$ and any threshold $\delta$ with $1/\delta \in O(\text{poly}(n))$, achieving theoretical optimality in most graph regimes. The SSPPR-A query attains partial optimality for large error thresholds, matching our new lower bound. This is the first optimal result for SSPPR queries. Our techniques generalize to the Single-Target Personalized PageRank (STPPR) query, improving its lower bound from $\Omega(\min(n, 1/\delta))$ to $\Omega\left(\min\left(m, \frac{n}{\delta} \log n\right)\right)$, matching the upper bound and revealing its optimality.
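To make the definition of $\pi(s,t)$ concrete, the following is a minimal Monte Carlo sketch of the $\alpha$-decay random walk, not the paper's algorithm. The function name `monte_carlo_ssppr`, the adjacency-list input `adj`, and the rule that a walk stops at a node with no out-neighbors are all assumptions made for illustration.

```python
import random
from collections import Counter


def monte_carlo_ssppr(adj, s, alpha, num_walks, seed=0):
    """Estimate pi(s, t) for all t by simulating alpha-decay random walks.

    adj       -- dict mapping each node to a list of out-neighbors (assumed input format)
    s         -- source node
    alpha     -- per-step termination probability, 0 < alpha <= 1
    num_walks -- number of independent walks to simulate
    """
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(num_walks):
        u = s
        # At each step the walk stops with probability alpha,
        # otherwise it moves to a uniformly random out-neighbor.
        while rng.random() >= alpha:
            neighbors = adj[u]
            if not neighbors:
                break  # assumption: a walk at a dangling node terminates there
            u = rng.choice(neighbors)
        hits[u] += 1
    # The fraction of walks that terminate at t estimates pi(s, t).
    return {t: count / num_walks for t, count in hits.items()}


if __name__ == "__main__":
    # Two-node cycle 0 -> 1 -> 0; here pi(0, 0) = alpha / (1 - (1 - alpha)^2).
    estimates = monte_carlo_ssppr({0: [1], 1: [0]}, s=0, alpha=0.2, num_walks=20000)
    print(estimates)
```

On the two-node cycle, a walk from node 0 can only stop at node 0 after an even number of moves, which gives the closed form in the comment and a simple sanity check for the estimator.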

Paper Structure

This paper contains 33 sections, 37 theorems, 112 equations, 4 figures, 2 tables, 5 algorithms.

Key Result

Theorem 1.4

We present DistWalks, an algorithm that solves the SSPPR-R query within $\mathcal{O}\left(m+n\log(n)\log\left(\frac{\log(n)}{m\delta}\right)\right)$ queries in expectation.
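The regime in which this bound meets the SSPPR-R lower bound stated in the abstract can be checked with a short calculation. This is a sketch under the assumptions $m \ge 1$, $1/\delta \le n^{c}$ for some constant $c$, and $m \in \Omega(n\log^2 n)$; the constant $c$ is introduced here only for illustration.

$$
\log\left(\frac{\log(n)}{m\delta}\right) \le \log\left(n^{c}\log n\right) = O(\log n)
\quad\Longrightarrow\quad
m + n\log(n)\log\left(\frac{\log(n)}{m\delta}\right) = O\left(m + n\log^2 n\right) = O(m).
$$

Taking the minimum with the $\frac{\log(1/\delta)}{\delta}$ term gives an upper bound of $O\left(\min\left(\frac{\log(1/\delta)}{\delta}, m\right)\right)$, which has the same form as the lower bound $\Omega\left(\min\left(m, \frac{\log(1/\delta)}{\delta}\right)\right)$.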

Figures (4)

  • Figure 1: Instance $\mathrm{U}$ $\to$ $r$-padded instance $\mathrm{U}(r)$.
  • Figure 2: An example for generating graph distribution $\Sigma(n, D, d)=\Sigma(2,2,1)$.
  • Figure 3: Graph examples for lower bounds. (a) An example of generating the graph distribution $\mathcal{G}(D,\mathbf{b})$ for the SSPPR-A query; (b) the graph instance $\mathcal{U}^{-1}(r)$, derived from $\mathcal{U}(r)$, for the STPPR query.
  • Figure 4: Illustration of a $2$-Lift.

Theorems & Definitions (47)

  • Definition 1.1
  • Definition 1.2
  • Definition 1.3
  • Theorem 1.4: Improved upper bound of SSPPR-R
  • Theorem 1.5: Improved upper bound of SSPPR
  • Theorem 1.6: Improved lower bound of SSPPR-R
  • Theorem 1.7: Improved lower bound of STPPR query
  • Theorem 1.8: Improved lower bound of SSPPR-A query
  • Definition 2.1: Multigraph and edge multiplicity [martens2022complexity]
  • Definition 2.2: Arc-Centric Graph Query Model [goldreich1998property, goldreich1997property]
  • ...and 37 more