Repelling Random Walks

Isaac Reid; Eli Berger; Krzysztof Choromanski; Adrian Weller

TL;DR

This work addresses the challenge of efficiently sampling on graphs without bias by introducing repelling random walks, a discrete quasi-Monte Carlo scheme that induces correlations among multiple walkers while preserving each walk's marginal transition probabilities. The method yields unbiased estimators with strengthened concentration and is simple to implement as a drop-in modification (sampling without replacement within neighbor blocks). The authors develop theory and demonstrate strong gains across three domains: graph-kernel estimation via Graph Random Features, PageRank approximation, and graphlet concentration estimation, including a theoretical variance-reduction result for kernels and a general PageRank variance bound. Empirically, repelling walkers consistently outperform iid walkers across synthetic and real graphs, suggesting broad practical impact for graph-based statistical estimation and learning tasks. The work opens avenues for further theoretical analysis and extensions of quasi-Monte Carlo sampling to a wider class of graph estimators and networks.

Abstract

We present a novel quasi-Monte Carlo mechanism to improve graph-based sampling, coined repelling random walks. By inducing correlations between the trajectories of an interacting ensemble such that their marginal transition probabilities are unmodified, we are able to explore the graph more efficiently, improving the concentration of statistical estimators whilst leaving them unbiased. The mechanism has a trivial drop-in implementation. We showcase the effectiveness of repelling random walks in a range of settings including estimation of graph kernels, the PageRank vector and graphlet concentrations. We provide detailed experimental evaluation and robust theoretical guarantees. To our knowledge, repelling random walks constitute the first rigorously studied quasi-Monte Carlo scheme correlating the directions of walkers on a graph, inviting new research in this exciting nascent domain.
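The "trivial drop-in implementation" can be illustrated with a minimal sketch of one timestep at a single node. It assumes an unweighted graph, so transitions are uniform over neighbours. The function name `repelling_step` and its arguments are hypothetical and are not taken from the authors' code. Walkers on the node are split into blocks no larger than the node degree, and each block draws neighbours without replacement, so walkers in the same block never pick the same neighbour while each walker's marginal choice stays uniform.

```python
import random


def repelling_step(neighbours, num_walkers, rng):
    """Pick a next node for each of `num_walkers` walkers sitting on one node.

    Hypothetical helper, assuming equal edge weights. Walkers are grouped
    into blocks of size len(neighbours); a final partial block is allowed.
    """
    degree = len(neighbours)
    if degree == 0:
        raise ValueError("node has no neighbours")
    choices = []
    for start in range(0, num_walkers, degree):
        block_size = min(degree, num_walkers - start)
        # rng.sample draws without replacement, so a block never repeats
        # a neighbour; each position in the sample is marginally uniform.
        choices.extend(rng.sample(neighbours, block_size))
    return choices


if __name__ == "__main__":
    rng = random.Random(0)
    # Seven walkers on a degree-3 node: two full blocks plus one leftover.
    print(repelling_step(["a", "b", "c"], 7, rng))
```

An independent (i.i.d.) ensemble would instead call `rng.choice(neighbours)` once per walker, which gives the same marginals but allows several walkers to crowd onto one neighbour.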

Paper Structure (14 sections, 3 theorems, 63 equations, 4 figures, 2 tables, 2 algorithms)


Introduction and related work
Repelling random walks
Application 1: approximating graph kernels
Pointwise kernel estimation
Downstream applications: kernel regression for node attribute prediction
Application 2: approximating PageRank
Application 3: approximating graphlet concentrations
Conclusion
Ethics and reproducibility
Relative contributions and acknowledgements
Appendices
Proof of Theorem 3.1 (superiority of repelling random walks for kernel estimation)
Proof of Theorem 4.2 (superiority of repelling random walks for PageRank estimation)
Proof of Theorem 4.4 (variance of step-by-step linear functions is reduced by transient repulsion)

Key Result

Theorem 3.1

Consider graph nodes indexed $(i,j)$ separated by at least $2$ edges. In the limit $\sigma \to 0$, provided the number of walkers in the transient repelling ensemble is smaller than or equal to the node degrees $d_{\{i,j\}}$ and the edge-weights of $\mathbf{W}$ are equal, the variance of the kernel estimate with repelling random walks is no greater than with i.i.d. walks, for both i) trees and ii) $2$-dimensional grids.

Figures (4)

Figure 1: Schematic for behaviour of repelling random walkers at a particular timestep. By sampling from each 'block' (blue and green rectangles) without replacement we get a more even distribution over neighbours, without changing the marginal probabilities.
Figure 2: Relative Frobenius norm error of estimates of the $2$-regularised Laplace kernel (lower is better) vs. number of random walks for: i) vanilla GRFs; ii) GRFs with antithetic termination (Reid et al., 2023) ('q-a-GRFs'); iii) GRFs with repelling walks ('q-r-GRFs'); iv) GRFs with both antithetic termination and repelling walks ('q-ar-GRFs'). Using both QMC schemes together gives the best results for all graphs considered, and the gains are large (sometimes a factor of $>2$). $N$ gives the number of nodes, $p$ is the edge-generation probability for the Erdős-Rényi graphs, and $d$ is the node degree of the $d$-regular graphs. One standard deviation on the mean error is shaded but is too small to see easily.
Figure 3: Graphlets for $k=3$
Figure 4: Mean square error on estimates of $k=3$ graphlet concentrations with different numbers of random walks on different graphs. Lower is better. Using the repelling scheme consistently improves the quality of the estimate compared to independent walks.
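To make the quantity in Figures 3 and 4 concrete, the sketch below computes exact $k=3$ graphlet concentrations by brute force on a small graph. It is not the paper's random-walk estimator, only the ground truth such an estimator targets. It assumes the standard definition (the fraction of connected induced 3-node subgraphs of each type); the function name and the labels `wedge` and `triangle` are illustrative choices, not the paper's notation.

```python
from itertools import combinations


def k3_graphlet_concentrations(adj):
    """Exact k=3 graphlet concentrations of an undirected simple graph.

    `adj` maps each node to the set of its neighbours (hypothetical input
    format chosen for this sketch).
    """
    wedges = 0
    triangles = 0
    for a, b, c in combinations(sorted(adj), 3):
        # Count edges among the three nodes of the induced subgraph.
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 3:
            triangles += 1
        elif edges == 2:
            # Two edges on three nodes always form a connected path.
            wedges += 1
        # Zero or one edge: the induced subgraph is disconnected, so skip.
    total = wedges + triangles
    if total == 0:
        return {"wedge": 0.0, "triangle": 0.0}
    return {"wedge": wedges / total, "triangle": triangles / total}


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant node 3 attached to node 2.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(k3_graphlet_concentrations(graph))
```

On the example graph there are two wedges and one triangle. Enumerating all triples costs $O(N^3)$, which is why sampling-based estimates such as those compared in Figure 4 are used on larger graphs.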

Theorems & Definitions (6)

Definition 2.1: Repelling random walks
Definition 2.2: Transient repelling random walks
Theorem 3.1: Superiority of repelling random walks for kernel estimation
Theorem 4.2: Superiority of repelling random walks for PageRank estimation
Definition 4.3: Step-by-step linear functions
Theorem 4.4: Variance of step-by-step linear functions is reduced by transient repulsion
