MANTRA: Temporal Betweenness Centrality Approximation through Sampling

Antonio Cruciani

TL;DR

The paper tackles the problem of computing temporal betweenness centrality on dynamic networks, where exact algorithms are impractical and existing approximation methods struggle to provide guarantees. It introduces MANTRA, a sampling-based framework that extends estimators to all feasible $(\star)$-temporal path optimalities and uses Monte Carlo Empirical Rademacher Averages (c-MCERA) to derive data-dependent sample-complexity bounds involving $D^{(\star)}$, $\rho^{(\star)}$, and $\zeta^{(\star)}$. To support these bounds, it develops a fast approximation algorithm for the temporal diameter, average path length, and connectivity rate with complexity $\tilde{O}\left(\frac{\log n}{\varepsilon^2}\cdot |\mathcal{E}|\right)$. Empirical evaluation on real-world networks shows that MANTRA outperforms the state-of-the-art ONBRA in running time, sample size, and memory consumption while maintaining high accuracy, which makes temporal centrality computation feasible on large graphs. The framework, including an open-source Julia implementation, provides a practical tool for tasks such as community detection and dynamic network analysis, with potential extensions to edge betweenness and other temporal metrics.
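As an illustration of the quantity named above, the following is a minimal sketch of a generic c-sample Monte Carlo Empirical Rademacher Average, written from the standard textbook definition rather than from the paper's implementation. The function name `mcera`, the layout `values[f][i]` (the value of function `f` on the i-th sample), and the toy numbers are assumptions made for this example only.

```python
import random
from typing import Sequence


def mcera(values: Sequence[Sequence[float]], c: int, rng: random.Random) -> float:
    """Generic c-trial Monte Carlo Empirical Rademacher Average.

    values[f][i] holds the value of function f on the i-th sample
    (hypothetical layout chosen for this sketch).
    """
    r = len(values[0])  # number of samples
    total = 0.0
    for _ in range(c):
        # One vector of independent Rademacher signs, uniform in {-1, +1}.
        sigma = [rng.choice((-1, 1)) for _ in range(r)]
        # Supremum over the (finite) family of the signed empirical mean.
        total += max(
            sum(s * x for s, x in zip(sigma, row)) / r for row in values
        )
    return total / c


# Toy family of three functions evaluated on four samples (made-up numbers).
toy_values = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.5, 0.0],
    [0.2, 0.2, 0.2, 0.2],
]
estimate = mcera(toy_values, c=25, rng=random.Random(7))
```

Because the toy family contains the all-zero function and every value lies in $[0,1]$, the estimate always falls in $[0,1]$; a smaller value indicates a family that correlates less with random noise on the drawn sample.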

Abstract

We present MANTRA, a framework for approximating the temporal betweenness centrality of all nodes in a temporal graph. Our method can compute probabilistically guaranteed high-quality temporal betweenness estimates (of nodes and temporal edges) under all the feasible temporal path optimalities presented in the work of Buß et al. (KDD, 2020). We provide a sample-complexity analysis of our method and speed up the temporal betweenness computation using a state-of-the-art progressive sampling approach based on Monte Carlo Empirical Rademacher Averages. Additionally, we provide an efficient sampling algorithm to approximate the temporal diameter, average path length, and other fundamental temporal graph characteristic quantities within a small error $\varepsilon$ with high probability. The running time of this approximation algorithm is $\tilde{\mathcal{O}}(\frac{\log n}{\varepsilon^2}\cdot |\mathcal{E}|)$, where $n$ is the number of nodes and $|\mathcal{E}|$ is the number of temporal edges in the temporal graph. We support our theoretical results with an extensive experimental analysis on several real-world networks and provide empirical evidence that the MANTRA framework improves the current state of the art in speed, sample size, and required space while maintaining high accuracy of the temporal betweenness estimates.
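One way to read the $\frac{\log n}{\varepsilon^2}$ factor in the running time is as a number of samples, each processed in time roughly proportional to $|\mathcal{E}|$. The sketch below is not the paper's bound: it uses the classical Hoeffding inequality with a union bound over $n$ estimates in $[0,1]$, and the function name and parameters are hypothetical. It only shows how a sample size of that order arises.

```python
import math


def hoeffding_sample_size(n: int, eps: float, delta: float) -> int:
    """Samples sufficient for n simultaneous [0, 1]-valued mean estimates
    to be within eps of their expectations with probability >= 1 - delta.

    Classical Hoeffding + union bound (an assumption of this sketch,
    not the data-dependent bound derived in the paper).
    """
    if n < 1 or not (0.0 < eps < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("need n >= 1, 0 < eps < 1, 0 < delta < 1")
    return math.ceil(math.log(2.0 * n / delta) / (2.0 * eps ** 2))


# Halving eps roughly quadruples the number of samples.
r_coarse = hoeffding_sample_size(n=1000, eps=0.1, delta=0.1)
r_fine = hoeffding_sample_size(n=1000, eps=0.05, delta=0.1)
```

The quadratic dependence on $1/\varepsilon$ and the logarithmic dependence on $n$ match the shape of the stated running time; constants and hidden logarithmic factors are not modeled here.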

Paper Structure (30 sections, 19 theorems, 42 equations, 8 figures, 1 table, 2 algorithms)

Introduction
Contributions.
Related Work
Preliminaries
Temporal Graphs, and Paths.
Temporal Betweenness Centrality.
Supremum Deviation and Empirical Rademacher Averages.
MANTRA: temporal Betweenness Centrality Approximation through Sampling
Temporal Betweenness Estimator
Sample Complexity bounds
Fast approximation of the characteristic quantities
The MANTRA Framework
Experimental Evaluation
Experimental Results
Efficiency and Scalability.
...and 15 more sections

Key Result

Lemma

$\rho^{(\star)} = \sum_{v\in V}\texttt{b}^{(\star)}_{v}$
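The identity above has a well-known analogue on static, unweighted graphs: the sum of normalized betweenness values equals the average number of internal nodes on shortest paths. The sketch below checks that static analogue by brute force on a toy graph. It is not the temporal statement from the paper; the normalization by $n(n-1)$ ordered pairs, the helper names, and the example graph are assumptions of this illustration.

```python
from collections import deque
from itertools import permutations


def shortest_paths(adj, s, t):
    """All shortest s-t paths as node lists (BFS plus predecessor backtracking)."""
    dist = {s: 0}
    preds = {s: []}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                preds[w] = [u]
                queue.append(w)
            elif dist[w] == dist[u] + 1:
                preds[w].append(u)
    if t not in dist:
        return []

    def build(v):
        if v == s:
            return [[s]]
        return [p + [v] for u in preds[v] for p in build(u)]

    return build(t)


def betweenness_and_rho(adj):
    """Return (normalized betweenness per node, average internal nodes)."""
    nodes = list(adj)
    norm = len(nodes) * (len(nodes) - 1)  # ordered pairs (assumed normalization)
    b = {v: 0.0 for v in nodes}
    rho = 0.0
    for s, t in permutations(nodes, 2):
        paths = shortest_paths(adj, s, t)
        if not paths:
            continue
        for p in paths:
            for v in p[1:-1]:  # internal nodes only
                b[v] += 1.0 / (len(paths) * norm)
        # Every shortest s-t path has the same number of internal nodes.
        rho += (len(paths[0]) - 2) / norm
    return b, rho


# Toy undirected graph: a 4-cycle a-b-d-c with a pendant node e attached to d.
adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
       "d": ["b", "c", "e"], "e": ["d"]}
b, rho = betweenness_and_rho(adj)
```

On this toy graph `sum(b.values())` and `rho` agree up to floating-point error, because each ordered pair contributes its number of internal nodes exactly once to both sides.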

Figures (8)

Figure 1: Comparison between temporal diameter and the average number of internal nodes for the Shortest (foremost) and Prefix-Foremost temporal path optimalities. The approximation was computed (over 10 runs) with our sampling algorithm using $256$ random seed nodes.
Figure 2: Experimental analysis for $\varepsilon\in\{0.01,0.007,0.005,0.001\}$. Comparison between the running times (a), sample sizes (b), and allocated memory (c) of ONBRA and MANTRA. (d) Supremum deviation of the absolute $\varepsilon$-approximation computed by MANTRA. The black line indicates that the two algorithms require the same amount of time/samples/memory; a gray line (followed by a red mark) indicates that the algorithm required more than $1$TB of memory to run on that data set with that specific $\varepsilon$ value.
Figure 3: (a) Relation between the running time and the sample size of MANTRA for the shortest temporal betweenness with $\varepsilon$ as in Figure \ref{fig:times_sh}. (b) Comparison between MANTRA and the exact algorithm running times for the shortest temporal betweenness on the biggest temporal networks.
Figure 4: Example of the $(\star)$-temporal paths described in Definition \ref{def::tps}. Shortest: $(s\xrightarrow{56}x\xrightarrow{80}y\xrightarrow{92}z),(s\xrightarrow{22}a\xrightarrow{36}b\xrightarrow{40}z)$, Shortest-Foremost: $(s\xrightarrow{22}a\xrightarrow{36}b\xrightarrow{40}z)$, and Prefix-Foremost: $(s\xrightarrow{1}u\xrightarrow{2}v\xrightarrow{3}w\xrightarrow{4}z)$.
Figure 5: Experimental evaluation (over 10 runs) of the sampling algorithm using $256$ seed nodes. Comparison between temporal effective diameter and the connectivity rate for the Shortest (foremost) and Prefix-Foremost temporal path optimalities. The last two log-plots compare the running times of the exact algorithm and our approximation algorithm.
...and 3 more figures
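The three path optimalities in the Figure 4 caption can be checked mechanically on the paths listed there. The sketch below assumes that those three $s$-$z$ paths are the only candidates, that timestamps must strictly increase along a path, that Shortest means fewest hops, that Shortest-Foremost means earliest arrival among the shortest paths, and that Prefix-Foremost means every prefix reaches its endpoint as early as any candidate does. These readings are inferred from the caption, not quoted from the paper's definition, and all function names are hypothetical.

```python
from typing import List, Tuple

# A temporal path is a list of (source, target, time) hops.
Path = List[Tuple[str, str, int]]

# The three s-z paths from the Figure 4 caption.
paths: List[Path] = [
    [("s", "x", 56), ("x", "y", 80), ("y", "z", 92)],
    [("s", "a", 22), ("a", "b", 36), ("b", "z", 40)],
    [("s", "u", 1), ("u", "v", 2), ("v", "w", 3), ("w", "z", 4)],
]


def is_time_respecting(p: Path) -> bool:
    """Consecutive hops must connect and have strictly increasing times."""
    return all(
        p[i][1] == p[i + 1][0] and p[i][2] < p[i + 1][2]
        for i in range(len(p) - 1)
    )


def arrival(p: Path) -> int:
    return p[-1][2]


def shortest(cands: List[Path]) -> List[Path]:
    fewest = min(len(p) for p in cands)
    return [p for p in cands if len(p) == fewest]


def shortest_foremost(cands: List[Path]) -> List[Path]:
    sh = shortest(cands)
    earliest = min(arrival(p) for p in sh)
    return [p for p in sh if arrival(p) == earliest]


def prefix_foremost(cands: List[Path]) -> List[Path]:
    # Earliest time at which any candidate prefix reaches each node.
    earliest = {}
    for p in cands:
        for _, v, t in p:
            earliest[v] = min(earliest.get(v, t), t)
    return [p for p in cands if all(t == earliest[v] for _, v, t in p)]


valid = [p for p in paths if is_time_respecting(p)]
```

Under these assumptions, `shortest(valid)` returns the two three-hop paths, `shortest_foremost(valid)` returns the path through $a$ and $b$, and `prefix_foremost(valid)` returns the four-hop path through $u$, $v$, $w$, in agreement with the caption.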

Theorems & Definitions (36)

Definition
Lemma
Theorem
Lemma
Theorem (see Li_2001, Section 1)
Theorem
Theorem
Theorem
Theorem
Definition (Temporal Edge Betweenness)
...and 26 more
