A novel metric for community detection
Ke-ke Shang, Michael Small, Yan Wang, Di Yin, Shu Li
TL;DR
The paper addresses the problem that Modularity-based metrics may mischaracterize communities by assuming higher internal density; it proposes a predictability-based criterion for community detection. The method defines $S_{pr}=\frac{\sum_{i=1}^n {\frac{S_{in}^i - S_{all}^i}{S_{all}^i}}}{n}$, where $S_{in}^i$ and $S_{all}^i$ are link-prediction accuracies for internal and all links under the $i$th predictor, and it evaluates three link-prediction schemes (CN, LHN1, HDI) across five networks with eight detection algorithms. The results show that internal-link predictability generally exceeds all-link predictability and that $S_{pr}$ provides a more stable, robust ranking of algorithms than Modularity, while also exposing failures (e.g., negative $S_{pr}$) and revealing broader statistical patterns. This work suggests a more flexible and informative view of what constitutes a community and offers a practical tool to compare algorithms across diverse networks.
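As a minimal sketch of the formula above, $S_{pr}$ is the mean, over the $n$ link predictors, of the relative gain of internal-link accuracy over all-link accuracy. The function name `s_pr` and the accuracy values below are hypothetical and chosen only for illustration. How each accuracy is measured is not specified in this summary, so the sketch takes the accuracies as given inputs.

```python
from typing import Sequence


def s_pr(s_in: Sequence[float], s_all: Sequence[float]) -> float:
    """Mean relative gain of internal-link over all-link prediction accuracy.

    s_in[i]  : accuracy of predictor i on links inside communities
    s_all[i] : accuracy of predictor i on all links
    """
    if len(s_in) != len(s_all) or len(s_in) == 0:
        raise ValueError("s_in and s_all must be non-empty and of equal length")
    if any(a <= 0 for a in s_all):
        raise ValueError("every all-link accuracy must be positive")
    n = len(s_in)
    return sum((i - a) / a for i, a in zip(s_in, s_all)) / n


# Hypothetical accuracies for three predictors (e.g. CN, LHN1, HDI).
s_in_scores = [0.90, 0.80, 0.60]
s_all_scores = [0.75, 0.80, 0.50]

score = s_pr(s_in_scores, s_all_scores)
print(f"S_pr = {score:.4f}")
```

A positive value means that links inside the detected communities are, on average, easier to predict than links overall. A negative value means the opposite, which is the failure case the summary mentions.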
Abstract
Research into the detection of dense communities has recently attracted increasing attention within network science, and various metrics for detecting such communities have been proposed. The most popular metric, Modularity, is based on the rule that links within communities are denser than links between communities, and it has become the default. However, this default metric suffers from ambiguity and, worse, all augmentations of modularity are based on a narrow intuition of what it means to form a "community". We argue that in specific but quite common systems, links within a community are not necessarily more common than links between communities. Instead, we propose that the defining characteristic of a community is that links are more predictable within a community than between communities. In this paper, building on the effect of communities on link prediction, we propose a novel metric for community detection based directly on this feature. We find that our metric is more robust than traditional modularity. Consequently, we can evaluate the stability of a given detection algorithm across different networks. Our metric can also directly uncover false community detections and infer further statistical characteristics of detection algorithms.
