Approximating Spanning Centrality with Random Bouquets
Gökhan Göktürk, Kamer Kaya
TL;DR
This paper tackles the high computational cost of All Edges Spanning Centrality (AESC) by introducing Bouquets, a hash-based, sampling-aware approach that clusters random walks into vectorizable groups and arranges them to maximize data locality. The method combines a hash-based RNG, SIMD-enabled RandomBouquet generation, and SABA. Used within the state-of-the-art approximation algorithm TGT+, it yields a speed-up of more than 100× with 16 threads while preserving approximation quality. Key contributions include a detailed algorithmic design for bouquet-based random walks, extensive randomness and cache-growth evaluations, and practical implementation choices that yield scalable AESC performance on real graphs and synthetic benchmarks. The work demonstrates that high-throughput, accurate AESC approximation is attainable on commodity CPUs, which broadens its applicability to network analysis tasks such as resilience and connectivity studies.
Abstract
Spanning Centrality is a measure used in network analysis to determine the importance of an edge in a graph based on its contribution to the connectivity of the entire network. Specifically, it quantifies how critical an edge is as the fraction of the graph's spanning trees that include that edge. The current state-of-the-art exact algorithm for All Edges Spanning Centrality (AESC), which computes the centrality values of all edges, has a time complexity of $\mathcal{O}(mn^{3/2})$ for $n$ vertices and $m$ edges. This makes the computation infeasible even for moderately sized graphs. As an alternative, approximation algorithms process a large number of random walks to estimate edge centralities. However, even the approximation algorithms can be computationally overwhelming, especially if the approximation error bound is small. In this work, we propose a novel, hash-based sampling method and a vectorized algorithm that together greatly improve the execution time by clustering random walks into Bouquets. On synthetic random walk benchmarks, Bouquets is $7.8\times$ faster than naive, traditional random-walk generation. We also show that the proposed technique is scalable by employing it within a state-of-the-art AESC approximation algorithm, TGT+. The experiments show that using Bouquets yields more than $100\times$ speed-up via parallelization with 16 threads.
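
To make the definition concrete, the following minimal Python sketch computes the spanning centrality of every edge of a tiny graph by brute force: it counts spanning trees with Kirchhoff's matrix-tree theorem and takes, for each edge, the fraction of spanning trees that contain it. This is an illustration only, not the exact algorithm or the approximation scheme discussed in this work. The function names `count_spanning_trees` and `spanning_centrality` are hypothetical, and the graph is assumed to be connected, undirected, and unweighted.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of an undirected graph on vertices 0..n-1.

    Uses Kirchhoff's matrix-tree theorem: the count equals the determinant
    of the graph Laplacian with one row and column (here: the last) removed.
    """
    if n == 1:
        return 1
    size = n - 1
    lap = [[Fraction(0)] * size for _ in range(size)]
    for u, v in edges:
        if u < size:
            lap[u][u] += 1
        if v < size:
            lap[v][v] += 1
        if u < size and v < size:
            lap[u][v] -= 1
            lap[v][u] -= 1

    # Exact Gaussian elimination over the rationals.
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if lap[r][col] != 0), None)
        if pivot is None:
            return 0  # singular matrix: the graph is disconnected
        if pivot != col:
            lap[col], lap[pivot] = lap[pivot], lap[col]
            det = -det
        det *= lap[col][col]
        for r in range(col + 1, size):
            factor = lap[r][col] / lap[col][col]
            for c in range(col, size):
                lap[r][c] -= factor * lap[col][c]
    return int(det)


def spanning_centrality(n, edges):
    """Return {edge: fraction of spanning trees containing that edge}.

    Assumes a connected graph. Trees containing an edge are counted as
    (all spanning trees) minus (spanning trees of the graph without it).
    """
    total = count_spanning_trees(n, edges)
    result = {}
    for i, edge in enumerate(edges):
        without = count_spanning_trees(n, edges[:i] + edges[i + 1:])
        result[edge] = Fraction(total - without, total)
    return result


# A triangle (0, 1, 2) with a pendant edge (2, 3).
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
for edge, value in spanning_centrality(4, example_edges).items():
    print(edge, value)
```

In this example the graph has three spanning trees. Each triangle edge appears in two of them, so its centrality is 2/3, while the bridge (2, 3) appears in all of them and has centrality 1. The values sum to $n - 1 = 3$, as expected, since every spanning tree has exactly $n - 1$ edges. Recomputing a determinant per edge is far too slow for real graphs, which is why random-walk-based approximation is used in practice.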
