Means of Hitting Times for Random Walks on Graphs: Connections, Computation, and Optimization

Haisong Xia, Wanyue Xu, Zuobai Zhang, Zhongzhi Zhang

TL;DR

This work develops a tight connection between absorbing random-walk centrality $H_j$ and the Kemeny constant $\mathcal{K}$, expressing both as quadratic forms in the Laplacian pseudoinverse $\boldsymbol{L}^{\dagger}$ and enabling scalable computation. It introduces Group Walk Centrality $H(S)$, proves its monotone and supermodular properties, and formulates the NP-hard MinGWC problem of selecting a size-$k$ vertex set that minimizes $H(S)$. The authors present a fast approximation framework, ApproxHK, to estimate $H_j$ and $\mathcal{K}$ in nearly linear time, and two greedy algorithms (DeterMinGWC and ApproxMinGWC) for MinGWC with provable guarantees, the latter achieving a $(1-\frac{k}{k-1}\frac{1}{e}-\epsilon)$-approximation in $\tilde{O}(km\epsilon^{-2})$ time. Extensive experiments on real and model networks validate the accuracy and scalability of ApproxHK and the greedy MinGWC methods, showing effectiveness on networks with millions of nodes. The results offer a practical toolkit for identifying influential node groups under a random-walk model and for applications in sensor placement, network design, and data mining.
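The link between $H_j$, $\mathcal{K}$, and $\boldsymbol{L}^{\dagger}$ can be checked by hand on a tiny graph. The following is a minimal sketch, not the authors' ApproxHK: it assumes an undirected graph given as a symmetric integer weight matrix `W` without self-loops, and it uses the standard identities $H_j = d_{\mathrm{sum}}\,(\boldsymbol{e}_j-\boldsymbol{\pi})^{\top}\boldsymbol{L}^{\dagger}(\boldsymbol{e}_j-\boldsymbol{\pi})$ and $\mathcal{K}=\sum_j \pi_j H_j$, where $d_{\mathrm{sum}}$ is the sum of weighted degrees; the paper's own expressions may be written differently. All function names are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def stationary(W):
    """Stationary distribution pi_i = d_i / sum(d) of the walk P = D^{-1} W."""
    deg = [sum(row) for row in W]
    return [Fraction(d) / sum(deg) for d in deg]


def hitting_times(W, j):
    """H_{i,j} for every start i, from the absorbing system (I - P_{-j}) h = 1."""
    n = len(W)
    deg = [sum(row) for row in W]
    free = [i for i in range(n) if i != j]
    A = [[(1 if a == b else 0) - Fraction(W[a][b]) / deg[a] for b in free]
         for a in free]
    h = [Fraction(0)] * n
    for i, v in zip(free, solve(A, [1] * len(free))):
        h[i] = v
    return h


def walk_centrality(W, j):
    """H_j = sum_i pi_i H_{i,j}, computed directly from the absorbing walk."""
    return sum(p * h for p, h in zip(stationary(W), hitting_times(W, j)))


def walk_centrality_quadratic(W, j):
    """H_j as sum(d) * x^T L^+ x with x = e_j - pi.

    Since x sums to zero, x^T L^+ x equals x_g^T y_g, where L_g y_g = x_g is
    the Laplacian system with one vertex (here the last one) grounded.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    pi = stationary(W)
    x = [(1 if i == j else 0) - pi[i] for i in range(n)]
    keep = range(n - 1)
    L_g = [[(deg[a] if a == b else 0) - W[a][b] for b in keep] for a in keep]
    y = solve(L_g, [x[a] for a in keep])
    return sum(deg) * sum(x[a] * y[a] for a in keep)


def kemeny(W):
    """Kemeny constant as the pi-weighted mean of the walk centralities H_j."""
    pi = stationary(W)
    return sum(pi[j] * walk_centrality(W, j) for j in range(len(W)))


# Path graph 0 - 1 - 2 with unit weights.
W = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [walk_centrality(W, j) for j in range(3)]
print(H, kemeny(W))
```

Exact rational arithmetic keeps the comparison between the two routes free of rounding error, at the cost of cubic-time solves that are only sensible for toy graphs.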

Abstract

For random walks on a graph $\mathcal{G}$ with $n$ vertices and $m$ edges, the mean hitting time $H_j$ from a vertex chosen from the stationary distribution to vertex $j$ measures the importance of $j$, while the Kemeny constant $\mathcal{K}$ is the mean hitting time from one vertex to another selected randomly according to the stationary distribution. In this paper, we first establish a connection between the two quantities, representing $\mathcal{K}$ in terms of $H_j$ for all vertices. We then develop an efficient algorithm that estimates $H_j$ for all vertices and $\mathcal{K}$ in time nearly linear in $m$. Moreover, we extend the centrality $H_j$ of a single vertex to $H(S)$ of a vertex set $S$, and establish a link between $H(S)$ and some other quantities. We further study the NP-hard problem of selecting a group $S$ of $k\ll n$ vertices with minimum $H(S)$, whose objective function is monotonic and supermodular. Finally, we propose two greedy algorithms that approximately solve the problem. The former has an approximation factor of $(1-\frac{k}{k-1}\frac{1}{e})$ and $O(kn^3)$ running time, while the latter returns a $(1-\frac{k}{k-1}\frac{1}{e}-\epsilon)$-approximation solution in time nearly linear in $m$, for any parameter $0<\epsilon<1$. Extensive experimental results validate the performance of our algorithms.
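The group selection problem can be illustrated with a naive greedy loop. This is a sketch under stated assumptions, not the paper's DeterMinGWC or ApproxMinGWC: it assumes $H(S)=\sum_i \pi_i H_{i,S}$, where $H_{i,S}$ is the expected number of steps for a walk started at $i$ to first reach any vertex of $S$, and it re-solves a linear system for every candidate instead of using any update trick. The names `group_walk_centrality` and `greedy_min_gwc` are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def group_walk_centrality(W, S):
    """Assumed definition: H(S) = sum_i pi_i * E[steps to first hit S from i]."""
    n = len(W)
    deg = [sum(row) for row in W]
    free = [i for i in range(n) if i not in S]
    if not free:
        return Fraction(0)
    # Multiplying (I - P_free) h = 1 by D gives the grounded system L_free h = d_free.
    L = [[(deg[a] if a == b else 0) - W[a][b] for b in free] for a in free]
    h = solve(L, [deg[a] for a in free])
    return sum(Fraction(deg[a]) * v for a, v in zip(free, h)) / sum(deg)


def greedy_min_gwc(W, k):
    """Add, k times, the vertex whose inclusion gives the smallest H(S)."""
    S = []
    for _ in range(k):
        candidates = [u for u in range(len(W)) if u not in S]
        # Ties are broken in favour of the lowest vertex index.
        best = min(candidates, key=lambda u: group_walk_centrality(W, S + [u]))
        S.append(best)
    return S, group_walk_centrality(W, S)


# Path graph 0 - 1 - 2 - 3 - 4 with unit weights.
W = [[0, 1, 0, 0, 0],
     [1, 0, 1, 0, 0],
     [0, 1, 0, 1, 0],
     [0, 0, 1, 0, 1],
     [0, 0, 0, 1, 0]]
print(greedy_min_gwc(W, 1))
print(greedy_min_gwc(W, 2))
```

On this path the greedy loop first takes the middle vertex and then a neighbour of it, giving $H(S)=11/8$, whereas the pair $\{1,3\}$ attains $1/2$. The example therefore also shows that a greedy choice need not be optimal.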

Paper Structure

This paper contains 29 sections, 21 theorems, 97 equations, 7 figures, 4 tables, 4 algorithms.

Key Result

Lemma 2.2

[Te91] Let $\mathcal{G}=(V,E,w)$ be a simple connected graph with $n$ vertices. Then the sum of weight times resistance distance over all pairs of adjacent vertices in $\mathcal{G}$ equals $n-1$.
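Lemma 2.2 reads like the classical Foster's theorem, under which the weighted sum of resistance distances over all edges equals $n-1$. The sketch below checks that identity numerically on two small graphs. It assumes a symmetric integer weight matrix `W` without self-loops and computes each resistance distance by grounding one endpoint and injecting a unit current at the other; the helper names are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def resistance(W, i, j):
    """Resistance distance between i and j: ground j, inject unit current at i."""
    if i == j:
        return Fraction(0)
    n = len(W)
    deg = [sum(row) for row in W]
    free = [a for a in range(n) if a != j]
    L = [[(deg[a] if a == b else 0) - W[a][b] for b in free] for a in free]
    potentials = solve(L, [1 if a == i else 0 for a in free])
    return potentials[free.index(i)]


def foster_sum(W):
    """Sum of w_ij * R_ij over all edges {i, j} of the graph."""
    n = len(W)
    return sum(W[i][j] * resistance(W, i, j)
               for i in range(n) for j in range(i + 1, n) if W[i][j])


# Weighted triangle: w_01 = 1, w_12 = 2, w_02 = 3.
triangle = [[0, 1, 3],
            [1, 0, 2],
            [3, 2, 0]]
# Unit-weight triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
pendant = [[0, 1, 1, 0],
           [1, 0, 1, 0],
           [1, 1, 0, 1],
           [0, 0, 1, 0]]
print(foster_sum(triangle), foster_sum(pendant))
```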

Figures (7)

  • Figure 1: The first several iterations of the pseudofractal scale-free web.
  • Figure 2: Construction process for the Koch network.
  • Figure 3: The Cayley tree $\mathcal{C}_{3,5}$.
  • Figure 4: Illustrations of the Tower of Hanoi graph $\mathcal{H}_{3}$ and its extension $\overline{\mathcal{H}}_{3}$, as well as their vertex labeling.
  • Figure 5: GWC $H(S)$ of vertex group $S$ computed by four different algorithms, DeterMinGWC (Deter), ApproxMinGWC (Approx), Random, and Optimum, on four networks: Zebra (a), Zachary karate club (b), Contiguous USA (c), and Les Miserables (d).
  • ...and 2 more figures

Theorems & Definitions (39)

  • Definition 2.1: $\epsilon$-approximation
  • Lemma 2.2
  • Lemma 2.3
  • Lemma 2.4: [BoRaZh11]
  • Lemma 3.1
  • proof
  • Lemma 3.2
  • proof
  • Lemma 4.1: JL Lemma [Ac01]
  • Lemma 4.2: SDD Solver [CoKyMiPaPeRaSu14, KySa16]
  • ...and 29 more