Fast Maximization of Current Flow Group Closeness Centrality
Haisong Xia, Zhongzhi Zhang
TL;DR
This work tackles the maximization of current flow closeness centrality (CFCC) for a node group $S$ of size $k$ in large-scale graphs, where $C(S)=\dfrac{n}{\mathrm{Tr}(\boldsymbol{L}_{-S}^{-1})}$ and $\boldsymbol{L}_{-S}$ is the graph Laplacian with the rows and columns indexed by $S$ removed. It introduces two greedy Monte Carlo algorithms, ForestCFCM and SchurCFCM, based on spanning-forest sampling and the Schur complement, and proves that they achieve a $\left(1-\dfrac{k}{k-1}\dfrac{1}{\mathrm{e}}-\epsilon\right)$ approximation factor in nearly linear time. ForestCFCM relies on unbiased forest-based estimators and adaptive sampling, while SchurCFCM leverages an auxiliary root set $T$ to obtain stronger diagonal dominance and faster sampling. Extensive experiments on real networks show substantial speedups (up to 370×) over the state-of-the-art method, with SchurCFCM delivering the best overall efficiency and effectiveness and enabling CFCC maximization on graphs with millions of nodes. These methods thus enable scalable identification of crucial node groups in large-scale network analysis.
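To make the definition of $C(S)$ concrete, the following minimal sketch evaluates it exactly with a dense matrix inverse. It is not ForestCFCM or SchurCFCM: it is a brute-force reference that is only practical for small graphs. The helper names `laplacian_from_edges` and `group_cfcc` are hypothetical, and the sketch assumes an unweighted, connected, undirected graph with nodes labeled `0..n-1` and a nonempty group `S`.

```python
import numpy as np

def laplacian_from_edges(n, edges):
    """Build the dense Laplacian L = D - A of an unweighted undirected graph."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def group_cfcc(L, S):
    """Exact C(S) = n / Tr(L_{-S}^{-1}); assumes a connected graph and nonempty S."""
    n = L.shape[0]
    grounded = set(S)
    keep = [v for v in range(n) if v not in grounded]
    # L_{-S}: delete the rows and columns indexed by S.
    L_minus_S = L[np.ix_(keep, keep)]
    return n / np.trace(np.linalg.inv(L_minus_S))

# Path graph 0 - 1 - 2.
L = laplacian_from_edges(3, [(0, 1), (1, 2)])
c_middle = group_cfcc(L, [1])  # Tr(L_{-S}^{-1}) = 2, so C(S) = 3/2
c_end = group_cfcc(L, [0])     # Tr(L_{-S}^{-1}) = 3, so C(S) = 1
```

On this three-node path, the middle node scores higher than an endpoint, which matches the intuition that it is electrically closer to the rest of the graph.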
Abstract
Derived from effective resistances, the current flow closeness centrality (CFCC) for a group of nodes measures the importance of node groups in an undirected graph with $n$ nodes. Given the widespread applications of identifying crucial nodes, we investigate the problem of maximizing CFCC for a node group $S$ subject to the cardinality constraint $|S|=k\ll n$. Although this problem is provably NP-hard, we propose two novel greedy algorithms for solving it. Our algorithms are based on spanning forest sampling and the Schur complement; they run in nearly linear time and achieve an approximation factor of $1-\frac{k}{k-1}\frac{1}{\mathrm{e}}-\epsilon$ for any $0<\epsilon<1$. Extensive experiments on real-world graphs illustrate that our algorithms outperform the state-of-the-art method in terms of efficiency and effectiveness, scaling to graphs with millions of nodes.
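For orientation, the greedy selection that the abstract refers to can be sketched with exact linear algebra in place of sampling. This is an assumption-laden baseline, not the authors' algorithms: each marginal gain is computed by a dense inverse, so it does not run in nearly linear time. The function names `trace_grounded_inverse` and `greedy_cfcc_exact` are hypothetical, and the graph is assumed to be connected with $k<n$.

```python
import numpy as np

def trace_grounded_inverse(L, S):
    """Tr(L_{-S}^{-1}) for a nonempty node group S of a connected graph."""
    grounded = set(S)
    keep = [v for v in range(L.shape[0]) if v not in grounded]
    return np.trace(np.linalg.inv(L[np.ix_(keep, keep)]))

def greedy_cfcc_exact(L, k):
    """Add, k times, the node that minimizes Tr(L_{-S}^{-1}), i.e. maximizes C(S)."""
    n = L.shape[0]
    S = []
    for _ in range(k):
        candidates = [u for u in range(n) if u not in S]
        best = min(candidates, key=lambda u: trace_grounded_inverse(L, S + [u]))
        S.append(best)
    return S

# Path graph 0 - 1 - 2 - 3 - 4, built directly as a tridiagonal Laplacian.
n = 5
L = np.zeros((n, n))
for u in range(n - 1):
    L[u, u] += 1.0
    L[u + 1, u + 1] += 1.0
    L[u, u + 1] -= 1.0
    L[u + 1, u] -= 1.0

first_pick = greedy_cfcc_exact(L, 1)   # the center node of the path
two_picks = greedy_cfcc_exact(L, 2)    # the center, then one of the two endpoints
```

On the five-node path, the first pick is the center node; the second pick is an endpoint, since grounding an endpoint reduces the trace to 3.5 whereas grounding a neighbor of the center only reduces it to 4.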
