Efficient Algorithms for Personalized PageRank Computation: A Survey

Mingji Yang, Hanzhi Wang, Zhewei Wei, Sibo Wang, Ji-Rong Wen

TL;DR

Personalized PageRank (PPR) is a traditional measure for node proximity on large graphs that reflects the importance between a pair of nodes $s$ and $t$ in a bidirectional way.

Abstract

Personalized PageRank (PPR) is a traditional measure for node proximity on large graphs. For a pair of nodes $s$ and $t$, the PPR value $\pi_s(t)$ equals the probability that an $\alpha$-discounted random walk from $s$ terminates at $t$ and reflects the importance between $s$ and $t$ in a bidirectional way. As a generalization of Google's celebrated PageRank centrality, PPR has been extensively studied and has found multifaceted applications in many fields, such as network analysis, graph mining, and graph machine learning. Despite numerous studies devoted to PPR over the decades, efficient computation of PPR remains a challenging problem, and there is a dearth of systematic summaries and comparisons of existing algorithms. In this paper, we recap several frequently used techniques for PPR computation and conduct a comprehensive survey of various recent PPR algorithms from an algorithmic perspective. We classify these approaches based on the types of queries they address and review their methodologies and contributions. We also discuss some representative algorithms for computing PPR on dynamic graphs and in parallel or distributed environments.
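
The random-walk definition of $\pi_s(t)$ in the abstract can be illustrated with a small Monte Carlo sketch. It assumes the usual reading of an $\alpha$-discounted walk: at every step the walk stops at its current node with probability $\alpha$, and otherwise moves to a uniformly random out-neighbor. The function name `ppr_monte_carlo`, the toy graph, and the parameter values are hypothetical and chosen only for illustration; the graph has no dead-end nodes, which sidesteps any dead-end handling rule.

```python
import random


def ppr_monte_carlo(graph, s, alpha=0.2, num_walks=20000, seed=0):
    """Estimate pi_s(t) for all t by simulating alpha-discounted random walks.

    graph: dict mapping each node to a non-empty list of out-neighbors
           (assumption: no dead ends).
    """
    rng = random.Random(seed)
    counts = {v: 0 for v in graph}
    for _ in range(num_walks):
        u = s
        # With probability alpha the walk terminates at the current node;
        # otherwise it moves to a uniformly random out-neighbor.
        while rng.random() >= alpha:
            u = rng.choice(graph[u])
        counts[u] += 1
    # The fraction of walks that terminate at t estimates pi_s(t).
    return {v: c / num_walks for v, c in counts.items()}


# Hypothetical 3-node directed graph, not taken from the paper.
toy_graph = {0: [1, 2], 1: [2], 2: [0]}
estimate = ppr_monte_carlo(toy_graph, s=0)
print(estimate)
```

For this toy graph the exact values can be solved by hand from $\boldsymbol{\pi}_s = \alpha \boldsymbol{e}_s + (1-\alpha) P^{\top} \boldsymbol{\pi}_s$, giving $\pi_0(0) = 0.2/0.424 \approx 0.472$, so the estimate for node 0 should land close to that value.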

Paper Structure

This paper contains 33 sections, 3 theorems, 22 equations, 1 figure, 4 tables, 2 algorithms.

Key Result

Theorem 1

For a preference vector $\boldsymbol{\sigma}$, we have $\boldsymbol{\pi}_{\boldsymbol{\sigma}}=\sum_{s\in V}\boldsymbol{\sigma}(s)\cdot\boldsymbol{\pi}_{s}$.
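
Theorem 1 can be checked numerically on a small graph. The sketch below assumes the standard series form $\boldsymbol{\pi}_{\boldsymbol{\sigma}} = \sum_{k \ge 0} \alpha (1-\alpha)^k (P^{\top})^k \boldsymbol{\sigma}$ and evaluates it by truncated power iteration; the helper `ppr_power_iteration`, the toy graph, and the preference vector are hypothetical, and the graph is assumed to have no dead-end nodes.

```python
def ppr_power_iteration(graph, sigma, alpha=0.2, num_iters=200):
    """Approximate the PPR vector for a preference vector sigma.

    graph: dict mapping each node to a non-empty list of out-neighbors.
    sigma: dict mapping nodes to non-negative weights that sum to 1.
    """
    pi = {v: 0.0 for v in graph}
    # walk[u] holds the probability mass of walks that are still alive
    # and currently located at u.
    walk = dict(sigma)
    for _ in range(num_iters):
        nxt = {v: 0.0 for v in graph}
        for u, mass in walk.items():
            pi[u] += alpha * mass  # the walk stops here with probability alpha
            share = (1 - alpha) * mass / len(graph[u])
            for v in graph[u]:
                nxt[v] += share  # otherwise it moves to a random out-neighbor
        walk = nxt
    return pi


# Hypothetical toy graph and preference vector, not taken from the paper.
toy_graph = {0: [1, 2], 1: [2], 2: [0]}
sigma = {0: 0.5, 1: 0.3, 2: 0.2}

pi_sigma = ppr_power_iteration(toy_graph, sigma)
single = {s: ppr_power_iteration(toy_graph, {s: 1.0}) for s in toy_graph}
combined = {
    t: sum(sigma[s] * single[s][t] for s in toy_graph) for t in toy_graph
}
max_gap = max(abs(pi_sigma[t] - combined[t]) for t in toy_graph)
print(max_gap)
```

The two sides agree up to floating-point rounding, which is what linearity predicts: the iteration is a linear map applied to $\boldsymbol{\sigma}$, so the result for a mixture of sources is the same mixture of the single-source results.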

Figures (1)

  • Figure 1: A running example of Forward Push on a toy graph. $s$ is the source node, $\alpha$ is set to $0.2$ and $r_\mathrm{max}^{(\mathrm{f})}$ is set to $0.3$. Each step stands for a single push operation and updated information is marked in red.
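
A minimal sketch of the kind of push procedure the figure illustrates is given below. It assumes the commonly used formulation in which a node $u$ is pushed while its residue exceeds $r_\mathrm{max}^{(\mathrm{f})} \cdot d_\mathrm{out}(u)$; the exact push condition in the paper's algorithm may differ. The toy graph is hypothetical and is not the graph shown in Figure 1; only $\alpha = 0.2$ and $r_\mathrm{max}^{(\mathrm{f})} = 0.3$ are taken from the caption.

```python
from collections import deque


def forward_push(graph, s, alpha=0.2, r_max=0.3):
    """Forward Push sketch: returns (reserve, residue) dicts for source s.

    graph: dict mapping each node to a non-empty list of out-neighbors
           (assumption: no dead ends).
    Assumed push condition: residue[u] > r_max * out_degree(u).
    """
    reserve = {v: 0.0 for v in graph}
    residue = {v: 0.0 for v in graph}
    residue[s] = 1.0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        deg = len(graph[u])
        if residue[u] <= r_max * deg:
            continue  # stale queue entry, nothing to push
        mass = residue[u]
        residue[u] = 0.0
        reserve[u] += alpha * mass  # an alpha fraction is settled at u
        for v in graph[u]:
            # The remaining mass is spread evenly over the out-neighbors.
            residue[v] += (1 - alpha) * mass / deg
            if residue[v] > r_max * len(graph[v]):
                queue.append(v)
    return reserve, residue


# Hypothetical 3-node directed graph, not the one in Figure 1.
toy_graph = {0: [1, 2], 1: [2], 2: [0]}
reserve, residue = forward_push(toy_graph, s=0)
print(reserve)
print(residue)
```

Each push moves probability mass without creating or destroying it, so the reserves and residues always sum to 1. On termination every node satisfies the assumed threshold, and the reserve of each node is an underestimate of its PPR value.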

Theorems & Definitions (4)

  • Definition 1: Probabilistic SSPPR Query with Relative Error Bounds
  • Theorem 1: The Linearity Theorem [jeh2003scaling]
  • Theorem 2: The Decomposition Theorem [jeh2003scaling]
  • Theorem 3: Symmetry of PPR on Undirected Graphs [avrachenkov2013choice]