A statistical test for network similarity
Pierre Miasnikof, Alexander Y. Shestopaloff
TL;DR
The paper presents a graph comparison framework that avoids node matching by converting graphs into all-pairs Jaccard distance matrices, treating these distances as empirical distributions, and comparing graphs with the Kolmogorov-Smirnov statistic. This probabilistic, multi-scale approach applies to both directed and undirected networks and is demonstrated across synthetic and real-world data, including change-point and anomaly scenarios. The results show that the KS-based distance between distance distributions accurately captures (dis)similarity and remains informative under various structural perturbations, though statistical significance should be interpreted within domain context. The work highlights the method's practicality for monitoring network evolution and its limitations when graphs become disconnected or extremely sparse.
Abstract
In this article, we revisit and expand our prior work on graph similarity. As in our earlier work, we focus on a view of similarity that does not require node correspondence between the graphs under comparison. Our work is suited to the temporal study of networks, to change-point and anomaly detection, and to simple comparisons of static graphs. It proposes a metric for comparing (weakly) connected graphs and assessing the (dis)similarity between them. For example, given three different graphs with possibly different numbers of nodes, $G_1$, $G_2$ and $G_3$, we aim to answer two questions: a) "How different is $G_1$ from $G_2$?" and b) "Is graph $G_3$ more similar to $G_1$ or to $G_2$?". We illustrate the value of our test and its accuracy through several new experiments, using synthetic and real-world graphs.
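The pipeline summarized in the TL;DR (pairwise Jaccard distances, treated as an empirical distribution and compared with the Kolmogorov-Smirnov statistic) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: graphs are undirected and given as adjacency dictionaries, the Jaccard distance between two nodes is computed directly from their closed neighbourhoods (the paper may define the all-pairs distance differently), and only the KS statistic is returned, with no p-value. The helper names `jaccard_distances`, `ks_statistic`, `ring` and `complete` are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def jaccard_distances(adj):
    """Sorted Jaccard distances over all unordered node pairs.

    adj maps each node to the set of its neighbours (undirected graph).
    Assumption: closed neighbourhoods N[v] = N(v) | {v} are compared.
    """
    closed = {v: set(nbrs) | {v} for v, nbrs in adj.items()}
    dists = []
    for u, v in combinations(closed, 2):
        inter = len(closed[u] & closed[v])
        union = len(closed[u] | closed[v])
        dists.append(1.0 - inter / union)
    return sorted(dists)


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the two empirical CDFs.

    xs and ys must be sorted, non-empty lists of distances.
    """
    points = sorted(set(xs) | set(ys))
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in points
    )


def ring(n):
    """Hypothetical test graph: cycle on n nodes."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def complete(n):
    """Hypothetical test graph: complete graph on n nodes."""
    return {i: set(range(n)) - {i} for i in range(n)}


# Question (b) from the abstract: is G3 more similar to G1 or to G2?
# No node matching is needed, and the graphs have different sizes.
g1, g2, g3 = ring(8), complete(8), ring(12)
d1, d2, d3 = (jaccard_distances(g) for g in (g1, g2, g3))
closer_to = "G1" if ks_statistic(d3, d1) < ks_statistic(d3, d2) else "G2"
print("G3 is closer to", closer_to)
```

Because each graph is reduced to a one-dimensional sample of distances, graphs with different node counts can be compared directly. A smaller KS value indicates more similar distance distributions under the assumptions stated above.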
