Network mutual information measures for graph similarity
Helcio Felippe, Federico Battiston, Alec Kirkley
TL;DR
This work introduces a principled information-theoretic framework for graph similarity, built on a family of graph mutual information measures that operate at different structural scales. It presents three encodings for quantifying the information shared between node-aligned graphs: edge-level overlap (NMI), degree-corrected neighborhood overlap (DC-NMI), and mesoscale structure via fixed partitions (MesoNMI). On synthetic perturbations and real multilayer networks (e.g., FAO trade data), the authors show that the microscale measures (NMI and DC-NMI) capture fine-grained similarity, while MesoNMI emphasizes coarser, community-like structure, enabling scale-aware comparisons in downstream analyses. The approach is described as fast, interpretable, and adaptable to weighted, directed, and higher-order network representations, with open-source code and data available for reproducibility and for applications such as anomaly detection.
Abstract
A wide range of tasks in network analysis, such as clustering network populations or identifying anomalies in temporal graph streams, require a measure of the similarity between two graphs. To provide a meaningful data summary for downstream scientific analyses, the graph similarity measures used for these tasks must be principled, interpretable, and capable of distinguishing meaningful overlapping network structure from statistical noise at different scales of interest. Here we derive a family of graph mutual information measures that satisfy these criteria and are constructed using only fundamental information-theoretic principles. Our measures capture the information shared among networks according to different encodings of their structural information, with our mesoscale mutual information measure allowing for network comparison under any specified network coarse-graining. We test our measures in a range of applications on real and synthetic network data, finding that they effectively highlight intuitive aspects of network similarity across scales in a variety of systems.
