A Unified Framework for Exploratory Learning-Aided Community Detection Under Topological Uncertainty
Yu Hou, Cong Tran, Ming Li, Won-Yong Shin
TL;DR
This work tackles overlapping community detection when the true network topology is unknown or only partially known. It introduces META-CODE, a unified framework that iterates over three steps: community-affiliation embedding via a GNN trained with a reconstruction loss, exploration of the hidden network through strategically selected node queries, and network inference using an edge-connectivity-based Siamese model. The approach jointly optimizes a community-affiliation matrix and a sequence of queries, leveraging node metadata and progressively revealed edges to refine edge predictions and communities. Theoretical results show that querying nodes in overlapping regions accelerates exploration and that META-CODE scales linearly with the number of edges. Extensive experiments on real networks show substantial gains over competitive methods (up to 65.55% in NMI over the best competitor on the Facebook dataset) and confirm the contribution of each module. Overall, META-CODE offers a practical, scalable way to uncover overlapping communities under topological uncertainty, with applicability to privacy-constrained or incomplete-network settings.
Abstract
In social networks, the discovery of community structures has received considerable attention as a fundamental problem in various network analysis tasks. However, due to privacy concerns or access restrictions, the network structure is often uncertain, which renders established community detection approaches ineffective without costly acquisition of the network topology. To tackle this challenge, we present META-CODE, a unified framework for detecting overlapping communities via exploratory learning aided by easy-to-collect node metadata when the network topology is unknown (or only partially known). Specifically, META-CODE consists of three iterative steps in addition to the initial network inference step: 1) learning node-level community-affiliation embeddings with graph neural networks (GNNs) trained by our new reconstruction loss, 2) network exploration via community-affiliation-based node queries, and 3) network inference from the explored network using an edge connectivity-based Siamese neural network model. Through extensive experiments on three real-world datasets including two large networks, we demonstrate: (a) the superiority of META-CODE over benchmark community detection methods, with gains of up to 65.55% in normalized mutual information (NMI) on the Facebook dataset over the best of the competing methods, (b) the impact of each module in META-CODE, (c) the effectiveness of node queries in META-CODE based on empirical evaluations and theoretical findings, and (d) the convergence of the inferred network.
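
The iterative loop described above (initial inference, then repeated embedding, querying, and re-inference) can be illustrated with a small Python sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the reconstruction-trained GNN is replaced by a damped multiplicative update for symmetric nonnegative factorization, the Siamese model by a cosine-similarity threshold on metadata, and the query rule by a simple heuristic that prefers nodes whose second-largest affiliation share is high (a rough proxy for lying in an overlapping region). All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def infer_network(X, known_adj, known_mask, threshold=0.3):
    """Stand-in for the Siamese inference model (assumption): predict an edge
    when metadata cosine similarity is high, then keep every explored entry."""
    Xn = X / np.clip(np.linalg.norm(X, axis=1, keepdims=True), 1e-12, None)
    adj = (Xn @ Xn.T >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)
    adj[known_mask] = known_adj[known_mask]  # explored entries override predictions
    return adj

def embed_affiliations(adj, k, steps=200, seed=0):
    """Stand-in for the GNN embedding (assumption): nonnegative F with
    adj ~ F @ F.T, fitted by a damped multiplicative update."""
    rng = np.random.default_rng(seed)
    F = rng.random((adj.shape[0], k)) + 0.1
    for _ in range(steps):
        F = F * (0.5 + 0.5 * (adj @ F) / (F @ F.T @ F + 1e-12))
    return F

def select_query(F, queried_mask):
    """Hypothetical query rule: pick the unqueried node whose second-largest
    affiliation share is highest (requires k >= 2)."""
    shares = F / np.clip(F.sum(axis=1, keepdims=True), 1e-12, None)
    overlap = np.sort(shares, axis=1)[:, -2]
    overlap[queried_mask] = -np.inf
    return int(np.argmax(overlap))

def meta_code_sketch(X, true_adj, k, budget):
    """Run the loop; true_adj plays the role of the oracle answering queries."""
    n = X.shape[0]
    known_adj = np.zeros((n, n))
    known_mask = np.zeros((n, n), dtype=bool)
    queried_mask = np.zeros(n, dtype=bool)
    queried = []
    adj = infer_network(X, known_adj, known_mask)      # initial inference
    for _ in range(min(budget, n)):
        F = embed_affiliations(adj, k)                 # step 1: embedding
        u = select_query(F, queried_mask)              # step 2: node query
        queried.append(u)
        queried_mask[u] = True
        known_mask[u, :] = True
        known_mask[:, u] = True
        known_adj[u, :] = true_adj[u]                  # reveal u's neighbors
        known_adj[:, u] = true_adj[u]
        adj = infer_network(X, known_adj, known_mask)  # step 3: re-inference
    return embed_affiliations(adj, k), adj, queried

# Toy data (assumption): two communities that overlap in nodes 2 and 3.
# The third metadata column is shared noise, so the initial inference over-predicts.
X = np.array([[1, 0, 1], [1, 0, 1], [1, 1, 1],
              [1, 1, 1], [0, 1, 1], [0, 1, 1]], dtype=float)
M = X[:, :2]
true_adj = ((M @ M.T) > 0).astype(float)
np.fill_diagonal(true_adj, 0.0)

F, adj, queried = meta_code_sketch(X, true_adj, k=2, budget=6)
```

In this sketch, each query reveals one row and column of the hidden adjacency matrix, so with a budget equal to the number of nodes the inferred network coincides with the true one. With a smaller budget, the unexplored entries still come from the metadata-based prediction, which is where a learned inference model would matter.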
