Approximating Single-Source Personalized PageRank with Absolute Error Guarantees

Zhewei Wei; Ji-Rong Wen; Mingji Yang

TL;DR

This work studies the classic Single-Source PPR query and proposes algorithms that, with high probability, return approximations with absolute error guarantees. For degree-normalized absolute error on undirected graphs, the expected complexity is $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$.

Abstract

Personalized PageRank (PPR) is an extensively studied and applied node proximity measure in graphs. For a pair of nodes $s$ and $t$ on a graph $G=(V,E)$, the PPR value $\pi(s,t)$ is defined as the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$, where the walk terminates with probability $\alpha$ at each step. We study the classic Single-Source PPR query, which asks for PPR approximations from a given source node $s$ to all nodes in the graph. Specifically, we aim to provide approximations with absolute error guarantees, ensuring that the resultant PPR estimates $\hat{\pi}(s,t)$ satisfy $\max_{t\in V}\big|\hat{\pi}(s,t)-\pi(s,t)\big|\le\varepsilon$ for a given error bound $\varepsilon$. We propose an algorithm that achieves this with high probability, with an expected running time of

- $\widetilde{O}\big(\sqrt{m}/\varepsilon\big)$ for directed graphs, where $m=|E|$;
- $\widetilde{O}\big(\sqrt{d_{\mathrm{max}}}/\varepsilon\big)$ for undirected graphs, where $d_{\mathrm{max}}$ is the maximum node degree in the graph;
- $\widetilde{O}\left(n^{\gamma-1/2}/\varepsilon\right)$ for power-law graphs, where $n=|V|$ and $\gamma\in\left(\frac{1}{2},1\right)$ is the extent of the power law.

These sublinear bounds improve upon existing results.

We also study the case when degree-normalized absolute error guarantees are desired, requiring $\max_{t\in V}\big|\hat{\pi}(s,t)/d(t)-\pi(s,t)/d(t)\big|\le\varepsilon_d$ for a given error bound $\varepsilon_d$, where the graph is undirected and $d(t)$ is the degree of node $t$. We give an algorithm that provides this error guarantee with high probability, achieving an expected complexity of $\widetilde{O}\left(\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d\right)$. This improves over the previously known $O(1/\varepsilon_d)$ complexity.
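To make the definition and the absolute error criterion concrete, here is a minimal Python sketch. It is not the algorithm proposed in the paper: it uses the textbook Monte Carlo estimator (simulate $\alpha$-discounted walks and count where they stop) and compares it against a power-iteration reference. The function names, the toy graph, the choice $\alpha=0.2$, and the number of walks are all illustrative assumptions.

```python
import random


def exact_ppr(adj, s, alpha, iters=200):
    """Reference PPR vector from s via power iteration.

    Assumes every node has at least one out-neighbor (hypothetical helper).
    """
    n = len(adj)
    pi = [0.0] * n
    alive = [0.0] * n          # probability mass of walks that have not stopped yet
    alive[s] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for u, mass in enumerate(alive):
            if mass == 0.0:
                continue
            pi[u] += alpha * mass                      # walk terminates at u
            share = (1.0 - alpha) * mass / len(adj[u])
            for v in adj[u]:
                nxt[v] += share                        # walk moves to a uniform neighbor
        alive = nxt
    return pi


def monte_carlo_ppr(adj, s, alpha, num_walks, rng):
    """Estimate pi(s, t) for all t as the fraction of walks that stop at t."""
    counts = [0] * len(adj)
    for _ in range(num_walks):
        u = s
        while rng.random() >= alpha:   # continue with probability 1 - alpha
            u = rng.choice(adj[u])
        counts[u] += 1
    return [c / num_walks for c in counts]


# Toy undirected graph as adjacency lists (assumption for illustration).
adj = [[1, 2], [0, 2], [0, 1, 3], [2]]
alpha, source = 0.2, 0

exact = exact_ppr(adj, source, alpha)
approx = monte_carlo_ppr(adj, source, alpha, num_walks=50_000, rng=random.Random(0))

# The SSPPR-A criterion: maximum absolute error over all target nodes.
max_err = max(abs(a - e) for a, e in zip(approx, exact))
print("max absolute error:", max_err)
```

This baseline needs on the order of $1/\varepsilon^2$ walks (up to logarithmic factors) to reach absolute error $\varepsilon$, which is the kind of cost the sublinear bounds above are meant to improve on.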

Paper Structure (20 sections, 17 theorems, 7 equations, 2 tables, 3 algorithms)

Introduction
Problem Formulation
Prior Complexity Bounds
Motivations
Our Results
Other Related Work
Notations and Tools
Notations
Backward Push
Power-Law Assumption
Our Algorithm for the SSPPR-A Query
High-Level Ideas
Techniques
Main Algorithm
Analyses for the SSPPR-A Query
...and 5 more sections

Key Result

Lemma 4

Let $\pi'(s,v)$ denote the estimate for $\pi(s,v)$ obtained in Phase i of the AbsPPR algorithm. With probability at least $1-1/n^2$, we have $\frac{1}{2}\pi(s,v)\le\pi'(s,v)\le\frac{3}{2}\pi(s,v)$ for all $v\in V$ with $\pi(s,v)\ge\frac{1}{4}\varepsilon$, and $\pi'(s,v)\le\pi(s,v)+\frac{1}{4}\varepsilon$ for ...

Theorems & Definitions (20)

Definition 1: SSPPR-A Query: Approximate SSPPR Query with Absolute Error Bounds
Definition 2: SSPPR-D Query: Approximate SSPPR Query with Degree-Normalized Absolute Error Bounds
Lemma 4
Lemma 5
Lemma 6
Lemma 7
Theorem 8
Claim 9
Theorem 10
Theorem 11: Symmetry of PPR on Undirected Graphs [avrachenkov2013choice]
...and 10 more
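
The symmetry property named in Theorem 11 is commonly stated in the literature as shown below for undirected graphs; the exact wording in the paper may differ. It suggests why the degree-normalized quantity in Definition 2 is natural: $\pi(s,t)/d(t)$ is symmetric in $s$ and $t$ up to a factor of $d(s)$. The second line is a short observation, assuming every node has degree at least 1, on why the $\sqrt{\sum_{t\in V}\pi(s,t)/d(t)}\big/\varepsilon_d$ bound is at most $1/\varepsilon_d$ up to logarithmic factors.

```latex
% Symmetry of PPR on undirected graphs (standard form)
d(s)\,\pi(s,t) = d(t)\,\pi(t,s)
\quad\Longrightarrow\quad
\frac{\pi(s,t)}{d(t)} = \frac{\pi(t,s)}{d(s)}
\qquad \text{for all } s,t \in V .

% Since d(t) \ge 1 and the PPR values from s sum to 1:
\sum_{t\in V} \frac{\pi(s,t)}{d(t)} \;\le\; \sum_{t\in V} \pi(s,t) \;=\; 1
\quad\Longrightarrow\quad
\frac{1}{\varepsilon_d}\sqrt{\sum_{t\in V} \frac{\pi(s,t)}{d(t)}} \;\le\; \frac{1}{\varepsilon_d} .
```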
