Why the Metric Backbone Preserves Community Structure

Maximilien Dreveton, Charbel Chucri, Matthias Grossglauser, Patrick Thiran

TL;DR

This work analyzes the metric backbone of a broad class of weighted random graphs with communities, and formally proves the robustness of the community structure with respect to the deletion of all the edges that are not in the metric backbone.

Abstract

The metric backbone of a weighted graph is the union of all-pairs shortest paths. It is obtained by removing all edges $(u,v)$ that are not the shortest path between $u$ and $v$. In networks with well-separated communities, the metric backbone tends to preserve many inter-community edges, because these edges serve as bridges connecting two communities, but tends to delete many intra-community edges because the communities are dense. This suggests that the metric backbone would dilute or destroy the community structure of the network. However, this is not borne out by prior empirical work, which instead showed that the metric backbone of real networks preserves the community structure of the original network well. In this work, we analyze the metric backbone of a broad class of weighted random graphs with communities, and we formally prove the robustness of the community structure with respect to the deletion of all the edges that are not in the metric backbone. An empirical comparison of several graph sparsification techniques confirms our theoretical finding and shows that the metric backbone is an efficient sparsifier in the presence of communities.
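The edge-removal rule in the abstract can be illustrated with a minimal sketch. It assumes an undirected graph whose edge weights are positive costs (lower means closer) and keeps an edge only when its cost matches the shortest-path cost between its endpoints. The names `shortest_path_costs`, `metric_backbone`, `toy_edges`, and the tolerance `tol` are hypothetical, chosen for illustration, and are not taken from the paper.

```python
import heapq
from collections import defaultdict


def shortest_path_costs(adj, source):
    """Dijkstra: cost of the cheapest path from `source` to each reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a cheaper path to u was already found
        for v, c in adj[u].items():
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def metric_backbone(edges, tol=1e-12):
    """Return the edges (u, v) whose cost equals the shortest-path cost from u to v.

    `edges` maps (u, v) pairs of an undirected graph to positive costs.
    `tol` is an illustrative tolerance for floating-point ties (an assumption).
    """
    adj = defaultdict(dict)
    for (u, v), c in edges.items():
        adj[u][v] = c
        adj[v][u] = c
    dist_from = {}
    kept = set()
    for (u, v), c in edges.items():
        if u not in dist_from:
            dist_from[u] = shortest_path_costs(adj, u)
        # Keep the edge only if no strictly cheaper path joins u and v.
        if c <= dist_from[u][v] + tol:
            kept.add((u, v))
    return kept


# Toy graph: the direct edge a-c (cost 3) is more expensive than a-b-c (cost 2).
toy_edges = {("a", "b"): 1.0, ("b", "c"): 1.0, ("a", "c"): 3.0, ("c", "d"): 2.0}
backbone = metric_backbone(toy_edges)
```

On this toy graph the edge between `a` and `c` is dropped because the two-hop path through `b` is cheaper, while the other three edges are kept. Running one Dijkstra search per source vertex is a simple choice for small graphs; it is not necessarily the procedure used in the paper.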

Paper Structure (35 sections, 7 theorems, 110 equations, 9 figures, 3 tables, 1 algorithm)

Key Result

Proposition 1

Let $(z,G) \sim \mathrm{wSBM} ( n, \pi, p, F)$. Suppose that Assumptions assumption:wSBM_scaling and assumption:regularity_fs hold and let $\tau_{\min}$ and $\tau_{\max}$ be defined following Equation eq:def_operator. Then, for two vertices $u$ and $v$ chosen uniformly at random in blocks $a$ and ...

Figures (9)

  • Figure 1: Effect of sparsification on the performance of clustering algorithms on various data sets. We observe that the metric backbone and spectral sparsification retain the community structure equally well across all data sets and for all clustering algorithms tested. Thresholding often yields several small disconnected components, which degrades the performance of clustering algorithms on $G^{\theta}$.
  • Figure 2: Graphs obtained from the Primary school data set after taking the metric backbone (Figure \ref{fig:primarySchool_backbone}) and after thresholding (Figure \ref{fig:primarySchool_threshold}), drawn using the same layout. Vertex colors represent the true clusters. Edges present in the metric backbone but not in the threshold graph are highlighted in red. Edges present in the threshold graph but not in the metric backbone are highlighted in blue.
  • Figure 3: Performance of spectral clustering on subsets of the MNIST and FashionMNIST datasets, and on the HAR dataset. The ARI is averaged over 10 trials; error bars show the standard error of the mean.
  • Figure 4: Performance of Poisson learning on subsets of the MNIST and FashionMNIST datasets, and on the HAR dataset. The ARI is averaged over 100 trials; error bars show the standard error of the mean.
  • Figure 5: Graphs obtained from the Primary school data set after taking the metric backbone (Figure \ref{fig:primarySchool_backbone_vs_Sp}) and after spectral sparsification (Figure \ref{fig:primarySchool_Sp}), drawn using the same layout. Vertex colors represent the true clusters. Edges present in the metric backbone but not in the spectral sparsifier graph are highlighted in red. Similarly, edges present in the spectral sparsifier graph but not in the metric backbone are highlighted in blue.
  • ...and 4 more figures

Theorems & Definitions (19)

  • Definition 1
  • Remark 1
  • Proposition 1
  • Theorem 1
  • Example 1
  • Example 2
  • Theorem 2
  • Proof of Proposition \ref{prop:weight_hop_count_shortest_paths} for exponentially distributed costs
  • Proof of Proposition \ref{prop:weight_hop_count_shortest_paths} with non-exponentially distributed costs
  • Proof of Theorem \ref{thm:mb_keeps_community}
  • ...and 9 more