
Distance Backbones Optimize Spreading Dynamics and Centrality Ranks in the Sparsification of Complex Networks

Miguel Bernardo Pereira, Felipe Xavier Costa, Luís M. Rocha

Abstract

Detailed network models of social, biological and other complex systems are often dense, which increases their computational complexity in simulations and analysis. To address this challenge, graph sparsification is used to remove edges while preserving desired network properties. Distance backbones of weighted graphs, which remove edges that break a generalized triangle inequality for any given path-length measure, preserve all shortest paths of weighted graphs. They have been shown to typically sparsify graphs more, and to preserve community structure and spreading dynamics better, than alternative state-of-the-art methods. Here, we show that they also preserve node centrality ranks, as well as local and global dynamics in spreading phenomena, significantly better than those alternatives. This is done by introducing the distance backbone synthesis (DBS), which progressively sparsifies weighted graphs according to a general family of nested distance backbones, whereby each edge is associated with the smallest distance backbone in which it appears. DBS provides a principled and natural method to sweep all possible degrees of sparsification while preserving connectivity, allowing us to precisely study (directed and undirected) weighted graph sparsification under multi-objective criteria. It provides an algebraically principled explanation of edge importance by revealing the precise topological space associated with each edge. The theory is demonstrated with a battery of social contact networks obtained from real-world social activity in different scenarios. Our study also shows that the optimal preservation of node centrality and spreading dynamics happens for the distance backbone obeying the generalized triangle inequality for the path-length measure $g(x, y) = (\sqrt[3]{x}+\sqrt[3]{y})^3$, which removes more than half of the edges from the empirical networks studied.
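As an illustration of the generalized triangle inequality described above, the following is a minimal sketch, not the authors' implementation. It assumes the path-length measure belongs to the one-parameter family $g_p(x, y) = (x^p + y^p)^{1/p}$, so that $p = 1/3$ gives the measure quoted in the abstract and $p = 1$ gives the ordinary sum. The names `distance_backbone`, `shortest_lengths`, and the exponent `p` are hypothetical, and the toy triangle is invented for the example. An edge is kept when no indirect path is strictly shorter than it under $g_p$; since $g_p$ is a sum after raising distances to the power $p$, an ordinary Dijkstra search on the transformed weights suffices.

```python
import heapq
from collections import defaultdict


def shortest_lengths(adj, source):
    """Dijkstra on additive weights; returns shortest lengths from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def distance_backbone(edges, p):
    """Keep the edges that no indirect path undercuts under g_p.

    edges: dict mapping an undirected pair (u, v) to a positive distance.
    p:     hypothetical exponent of the assumed family g_p (p > 0).
    """
    # Raising distances to the power p turns g_p path length into a plain sum.
    adj = defaultdict(list)
    for (u, v), d in edges.items():
        w = d ** p
        adj[u].append((v, w))
        adj[v].append((u, w))

    kept = set()
    for source in {u for u, _ in edges}:
        dist = shortest_lengths(adj, source)
        for (u, v), d in edges.items():
            # Small relative tolerance guards against floating-point noise.
            if u == source and d ** p <= dist[v] * (1 + 1e-9):
                kept.add((u, v))
    return kept


# Toy triangle (invented data): the direct a-c edge is long.
edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 3.0}
metric_backbone = distance_backbone(edges, 1.0)      # sum: 1 + 1 = 2 < 3
cube_root_backbone = distance_backbone(edges, 1 / 3)  # (1 + 1)^3 = 8 > 3
```

In this toy case the sum-based backbone drops the a-c edge, while the cube-root measure keeps all three edges, which is consistent with the nesting of backbones that the abstract describes.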

Paper Structure (16 sections, 18 equations, 4 figures, 3 tables)


Figures (4)

  • Figure 1: Progressive sparsification of a social contact network from a high school in Marseille, France (Mastrandrea et al., 2015), described in Sec. \ref{0_main:sec:social_networks}, by the Distance Backbone Synthesis, $\textbf{DBS}$. (Below) The relative size, $\mathcal{X}$, of the subgraph found via the backbone synthesis based on the path-length measure in Equation \ref{0_main:eq:dombi_tdnorms}, as a function of $\lambda^D$. When $\mathcal{X}=1$ there is no sparsification, and when $\mathcal{X}=0$ we have the largest possible sparsification. (Above) Important subgraphs in this sparsification: the drastic backbone $B^{g_{0}}$, which is the original graph $G$, the $B^{g_{0.5}}$ backbone, the metric backbone $B^{m} = B^{g_{1}}$, the Euclidean backbone $B^{g_{2}}$, and the ultra-metric backbone $B^{um} = B^{g_{+\infty}}$.
  • Figure 2: Sparsification effect on social contact networks. For the 12 social contact networks described in Sec. \ref{0_main:sec:social_networks}, we show the distribution of the area under the curve, across sparsification levels, of (Left) the eigenvector centrality rank correlation, $\rho^C$, (Middle) the rank correlation of average time of infection, $\rho^t$, and (Right) the time at which half of the nodes become infected in the original network relative to the same time measured in the sparsified network, $\xi^{-1}_{0.5}$ (details in Sec. \ref{0_main:sec:performance_metrics}). A Mann-Whitney U test was used to establish whether the distribution for $\textbf{DBS}$ (green) is significantly larger than the distribution for each of the sparsification methods discussed in the text (see legend), yielding the annotated $p$-values.
  • Figure 3: Optimal backbones that preserve desired network features. For each network, we measure the optimal $\lambda^D$ found via a weighted sum of the eigenvector centrality ranking, $\rho^C$, infection timing ranking, $\rho^t$, infection timescale, $\xi^{-1}_{0.5}$, and sparsity, $1-\mathcal{X}$, over the various scenarios described in Section \ref{0_main:sec:optimization_methods}. We plot the distribution of the optimal $\lambda^D$ over the 12 social contact networks considered. The particular cases of the metric backbone (light green) and the product backbone (light blue) are highlighted. The latter is shown as a small range of values because it can only be defined in a different parametrization, as discussed in Supplementary Section B. We also show, in black, the median and inter-quartile range, over the 12 networks, of the median $\lambda_{ij}$.
  • Figure 4: Example of the sparsification performance metrics on the social contact network from a high school in Marseille, France (Mastrandrea et al., 2015), described in Sec. \ref{0_main:sec:social_networks}. Curves for each sparsification method (per legend) are shown for different levels of sparsification, $\mathcal{X}$. (Top-Left) Spearman's rank correlation between the eigenvector centrality of nodes in the sparsified network and in the original network, $\rho^C$. (Bottom-Left) Time at which half of the nodes become infected in the original network relative to the same time measured in the sparsified network, $\xi^{-1}_{0.5}$. Error bars quantify the standard deviation over different choices of seed node in the simulations. (Top-Right) Average time at which each node became infected, for a specific seed node, in the original graph (horizontal axis) and in networks sparsified by the Distance Backbone Synthesis, $\textbf{DBS}$ (vertical axis). Each color corresponds to a different sparsification level, $\mathcal{X}$, and is annotated with the Spearman's rank correlation between those times, $\rho^t$. (Bottom-Right) Spearman's rank correlation between average times of infection, $\rho^t$. Error bars quantify the standard deviation over different choices of seed node in the simulations.
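
The rank-correlation metrics $\rho^C$ and $\rho^t$ in the captions above compare a per-node quantity in the original network with the same quantity in a sparsified network. The following is a minimal sketch of that comparison using Spearman's rank correlation, assuming the per-node scores have already been computed. The function names and the centrality values are hypothetical and invented for illustration; they are not results from the paper.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical centrality scores for four nodes (invented data).
original = {"a": 0.9, "b": 0.5, "c": 0.3, "d": 0.1}
sparsified = {"a": 0.8, "b": 0.6, "c": 0.1, "d": 0.2}

nodes = sorted(original)
rho = spearman([original[n] for n in nodes], [sparsified[n] for n in nodes])
```

Here only nodes c and d swap ranks, so the correlation stays high; a value of 1 would mean the sparsified network reproduces the original ranking exactly.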