Hierarchical community detection benchmark for heterogeneous inter-community connectivity
Brendan Cross, Boleslaw K. Szymanski
TL;DR
The paper addresses the need for robust benchmarks that capture hierarchical community structure and inter-community heterogeneity, so that modularity-based community detection methods can be stress-tested against the resolution limit. It introduces the Hierarchical Generalized LFR (HGLFR) benchmark, which extends the LFR/GLFR framework with multiple hierarchy levels and level-specific mixing parameters, controlled by $L$, $\mu_L$, $\Delta_{\mu_L}$, and $S$. The approach preserves the core distributional properties of LFR/GLFR while enabling inter-community heterogeneity and structures that can trigger the resolution limit; this is demonstrated through validation against LFR/GLFR and an analysis of detectability across hierarchy levels. The work provides a more realistic benchmark for evaluating detection algorithms in multi-scale networks and highlights areas for further refinement in degree-heterogeneity control and edge assignment across hierarchical levels.
Abstract
We introduce a new tool for community detection research: a network generator whose parameters control the structure of the networks it creates. Network scientists designing novel community detection algorithms typically evaluate them on synthetically generated benchmarks that contain the community structures they intend to detect, scaling the benchmark networks across size and density. Currently available benchmarks rely on generators limited to the properties of the LFR and GLFR networks. We improve on these benchmarks with a new hierarchical benchmark, the HGLFR, which preserves the properties of the LFR and GLFR while extending them to include heterogeneous inter-community connectivity. Networks generated by this benchmark are shown to contain structures that trigger the resolution limit while maintaining assortative connectivity.
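To make the resolution limit mentioned above concrete, the following is a minimal sketch that is not taken from the paper and does not use the HGLFR generator. It assumes the standard ring-of-cliques illustration: small cliques joined in a ring by single edges. The helper names `ring_of_cliques` and `modularity` are hypothetical and written only for this example. The sketch compares the Newman-Girvan modularity of the intuitive partition (one community per clique) with a coarser partition that merges adjacent cliques in pairs.

```python
from itertools import combinations


def ring_of_cliques(num_cliques, clique_size):
    """Edge list of `num_cliques` cliques joined in a ring by single edges.

    Illustrative helper (not from the paper); assumes num_cliques >= 3.
    """
    edges = []
    for k in range(num_cliques):
        first = k * clique_size
        # All edges inside clique k.
        edges.extend(combinations(range(first, first + clique_size), 2))
        # One bridge from this clique to the next one around the ring.
        edges.append((first, ((k + 1) % num_cliques) * clique_size))
    return edges


def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected graph."""
    m = len(edges)
    internal = {}  # edges with both endpoints in the community
    degree = {}    # sum of node degrees in the community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


num_cliques, clique_size = 30, 5
edges = ring_of_cliques(num_cliques, clique_size)
nodes = range(num_cliques * clique_size)

# Partition A: every clique is its own community.
q_single = modularity(edges, {v: v // clique_size for v in nodes})
# Partition B: adjacent cliques merged in pairs.
q_paired = modularity(edges, {v: v // (2 * clique_size) for v in nodes})

print(f"one community per clique: Q = {q_single:.4f}")
print(f"cliques merged in pairs:  Q = {q_paired:.4f}")
```

With these settings the merged partition scores higher than the one that keeps each clique separate, so a method that purely maximizes modularity would prefer the coarser answer. This is the kind of behaviour a benchmark with hierarchical structure is meant to expose.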
