Means of Hitting Times for Random Walks on Graphs: Connections, Computation, and Optimization

Haisong Xia; Wanyue Xu; Zuobai Zhang; Zhongzhi Zhang

TL;DR

This work develops a tight connection between absorbing random-walk centrality $H_j$ and the Kemeny constant $\mathcal{K}$, expressing both as quadratic forms in the Laplacian pseudoinverse $\boldsymbol{L}^{\dagger}$ and enabling scalable computation. It introduces Group Walk Centrality $H(S)$, proves its monotone and supermodular properties, and formulates the NP-hard MinGWC problem of selecting a size-$k$ vertex set that minimizes $H(S)$. The authors present a fast approximation framework, ApproxHK, to estimate $H_j$ and $\mathcal{K}$ in nearly linear time, and two greedy algorithms (DeterMinGWC and ApproxMinGWC) for MinGWC with provable guarantees, the latter achieving a $(1-\frac{k}{k-1}\frac{1}{e}-\epsilon)$-approximation in $\tilde{O}(km\epsilon^{-2})$ time. Extensive experiments on real and model networks validate the accuracy and scalability of ApproxHK and the greedy MinGWC methods, showing effectiveness on networks with millions of nodes. The results offer a practical toolkit for identifying influential node groups under a random-walk model and for applications in sensor placement, network design, and data mining.
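To make the two quantities concrete: for a connected undirected graph with degrees $d_i$ and stationary probabilities $\pi_i = d_i/(2m)$, write $H_{i,j}$ for the expected hitting time from $i$ to $j$. Then $H_j=\sum_i \pi_i H_{i,j}$ and $\mathcal{K}=\sum_j \pi_j H_{i,j}$ (the same for every start $i$), so $\mathcal{K}=\sum_j \pi_j H_j$. The sketch below checks this on a toy graph by exact brute force. It is only an illustration, not ApproxHK: the helper names (`solve`, `hitting_times_to`, `walk_centrality_and_kemeny`) and the toy graph are assumptions, and the cost is far from nearly linear.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def hitting_times_to(adj, j):
    """Expected hitting times from every vertex to target j (entry j is 0)."""
    n = len(adj)
    others = [i for i in range(n) if i != j]
    # Laplacian with row/column j removed: d_i*h_i - sum_{k~i, k!=j} h_k = d_i
    A = [[(len(adj[i]) if i == k else 0) - (1 if k in adj[i] else 0)
          for k in others] for i in others]
    b = [len(adj[i]) for i in others]
    h = [Fraction(0)] * n
    for i, t in zip(others, solve(A, b)):
        h[i] = t
    return h


def walk_centrality_and_kemeny(adj):
    """Return (pi, H, K_from): H[j] is H_j; K_from[i] is the Kemeny sum from start i."""
    n = len(adj)
    two_m = sum(len(nb) for nb in adj)
    pi = [Fraction(len(nb), two_m) for nb in adj]
    to = [hitting_times_to(adj, j) for j in range(n)]  # to[j][i] = H_{i,j}
    H = [sum(pi[i] * to[j][i] for i in range(n)) for j in range(n)]
    K_from = [sum(pi[j] * to[j][i] for j in range(n)) for i in range(n)]
    return pi, H, K_from


# Hypothetical toy graph: triangle 0-1-2 with a pendant vertex 3 attached to 2.
adj = [{1, 2}, {0, 2}, {0, 1, 3}, {2}]
pi, H, K_from = walk_centrality_and_kemeny(adj)
K = sum(p * h for p, h in zip(pi, H))
print("H_j:", H)
print("Kemeny constant:", K, "same from every start:", all(k == K for k in K_from))
```

Exact fractions are used so that the identity can be checked with `==` rather than a floating-point tolerance; a smaller $H_j$ marks a vertex that the walk reaches sooner on average.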

Abstract

For random walks on a graph $\mathcal{G}$ with $n$ vertices and $m$ edges, the mean hitting time $H_j$ from a vertex chosen from the stationary distribution to vertex $j$ measures the importance of $j$, while the Kemeny constant $\mathcal{K}$ is the mean hitting time from one vertex to another selected randomly according to the stationary distribution. In this paper, we first establish a connection between the two quantities, representing $\mathcal{K}$ in terms of $H_j$ for all vertices. We then develop an efficient algorithm that estimates $H_j$ for all vertices and $\mathcal{K}$ in time nearly linear in $m$. Moreover, we extend the centrality $H_j$ of a single vertex to $H(S)$ of a vertex set $S$, and establish a link between $H(S)$ and some other quantities. We further study the NP-hard problem of selecting a group $S$ of $k\ll n$ vertices with minimum $H(S)$, whose objective function is monotonic and supermodular. Finally, we propose two greedy algorithms that approximately solve the problem. The former has an approximation factor of $(1-\frac{k}{k-1}\frac{1}{e})$ and $O(kn^3)$ running time, while the latter returns a $(1-\frac{k}{k-1}\frac{1}{e}-\epsilon)$-approximate solution in time nearly linear in $m$, for any parameter $0<\epsilon<1$. Extensive experimental results validate the performance of our algorithms.
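The group-selection problem can be illustrated with a naive greedy loop. The sketch assumes that $H(S)$ is the expected time for a walk started from the stationary distribution to first reach any vertex of $S$, and it recomputes $H(S\cup\{u\})$ exactly for every candidate $u$. It is not the paper's DeterMinGWC or ApproxMinGWC and is much slower than either; the names `group_walk_centrality` and `greedy_min_gwc` are hypothetical.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def group_walk_centrality(adj, S):
    """Assumed H(S): stationary-averaged expected hitting time to a non-empty set S."""
    n = len(adj)
    two_m = sum(len(nb) for nb in adj)
    others = [i for i in range(n) if i not in S]
    if not others:
        return Fraction(0)
    # Laplacian with the rows/columns of S removed: L_{-S} h = d_{-S}
    A = [[(len(adj[i]) if i == k else 0) - (1 if k in adj[i] else 0)
          for k in others] for i in others]
    b = [len(adj[i]) for i in others]
    h = solve(A, b)
    return sum(Fraction(len(adj[i]), two_m) * t for i, t in zip(others, h))


def greedy_min_gwc(adj, k):
    """Add, k times, the vertex whose inclusion gives the smallest H(S)."""
    S = []
    for _ in range(k):
        candidates = (u for u in range(len(adj)) if u not in S)
        best = min(candidates,
                   key=lambda u: group_walk_centrality(adj, set(S) | {u}))
        S.append(best)
    return S


# Hypothetical toy graph: the path 0-1-2-3-4.
path5 = [{1}, {0, 2}, {1, 3}, {2, 4}, {3}]
S = greedy_min_gwc(path5, 2)
print("greedy set:", S, "H(S) =", group_walk_centrality(path5, set(S)))
print("H({1, 3}) =", group_walk_centrality(path5, {1, 3}))
```

On this path the greedy loop first takes the middle vertex and then one of its neighbours, whereas the pair $\{1, 3\}$ has a smaller $H(S)$. This is why greedy selection comes with an approximation guarantee rather than a guarantee of optimality.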
