Flow Divergence: Comparing Maps of Flows with Relative Entropy
Christopher Blöcker, Ingo Scholtes
TL;DR
Flow divergence is a dissimilarity measure for network partitions, inspired by the Kullback-Leibler divergence, that accounts for link patterns by building on the map equation's description length for random walks. Treating one partition, $\mathsf{M}_a$, as the reference and another, $\mathsf{M}_b$, as its estimate, the measure gives the expected number of extra bits required to describe a random walk when using $\mathsf{M}_b$ instead of $\mathsf{M}_a$, formalized as $D_F(\mathsf{M}_a||\mathsf{M}_b)$. Central to the approach are mapsim, modular coding, and a walking-on-maps construction that derives module-dependent transition rates; because the measure is based on random walks, it can compare partitions of different hierarchical depths. Flow divergence distinguishes between partitions that traditional measures rate as equally close to a reference, estimates the cost of overfitting in incomplete networks, and supports embedding and visualising the solution landscape of partitions in real networks. In practice, this offers a link-pattern-aware way to evaluate community descriptions, diagnose overfitting, and explore the solution space of network partitions.
Abstract
Networks represent how the entities of a system are connected, and the same network can be partitioned in different ways, which raises the question of how to compare partitions. Common approaches for comparing network partitions include information-theoretic measures based on mutual information and set-theoretic measures such as the Jaccard index. These measures are often based on computing the agreement in terms of overlap between different partitions of the same set. However, they ignore link patterns, which are essential for the organisation of networks. We propose flow divergence, an information-theoretic divergence measure for comparing network partitions, inspired by the ideas behind the Kullback-Leibler divergence and the map equation for community detection. Similar to the Kullback-Leibler divergence, flow divergence adopts a coding perspective and compares two network partitions $\mathsf{M}_a$ and $\mathsf{M}_b$ by considering the expected number of extra bits required to describe a random walk on a network using $\mathsf{M}_b$ relative to the reference partition $\mathsf{M}_a$. Because flow divergence is based on random walks, it can be used to compare partitions with arbitrary and different depths. We show that flow divergence distinguishes between partitions that traditional measures consider to be equally good when compared to a reference partition. Applied to real networks, we use flow divergence to estimate the cost of overfitting in incomplete networks and to visualise the solution landscape of network partitions.
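The coding perspective mentioned above can be illustrated with the ordinary Kullback-Leibler divergence, which measures the expected number of extra bits paid when events drawn from a reference distribution are encoded with a code optimised for an estimate. The sketch below is not flow divergence itself, which is defined over random walks and partitions; the function name `kl_divergence_bits` and the two toy distributions are assumptions chosen only for illustration.

```python
import math


def kl_divergence_bits(p, q):
    """Expected extra bits per event when events follow p but are coded for q.

    Plain Kullback-Leibler divergence in bits (log base 2). Illustrative
    helper only; it is not the flow divergence measure from the paper.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue  # outcomes that never occur contribute nothing
        if q_i == 0:
            return math.inf  # the estimate cannot encode an outcome that occurs
        total += p_i * math.log2(p_i / q_i)
    return total


# Toy distributions (assumed values, for illustration only).
reference = [0.5, 0.5]
estimate = [0.25, 0.75]

print(kl_divergence_bits(reference, reference))  # no extra cost against itself
print(kl_divergence_bits(reference, estimate))   # extra bits when using the estimate
print(kl_divergence_bits(estimate, reference))   # a different value: not symmetric
```

As with the Kullback-Leibler divergence, the roles of reference and estimate are not interchangeable, which is why the abstract speaks of describing a random walk using $\mathsf{M}_b$ relative to the reference $\mathsf{M}_a$.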
