Bounded Graph Clustering with Graph Neural Networks
Kibidi Neocosmos, Diego Baptista, Nicole Ludwig
TL;DR
The paper addresses a known weakness of graph neural networks for community detection: they often fail to produce the number of clusters the user specifies. It introduces a constraint-based approach that bounds the number of output communities by adding a row-normalized cluster-assignment constraint and a balance regularizer to the loss, so that the user can specify either a range or an exact count. Empirical results on synthetic stochastic block models (SBMs) and real networks show that the constraint effectively enforces the bounds, improves clustering quality when combined with regularization, and preserves runtime. The work also outlines limitations and avenues for future research, such as searching for the optimal number of communities and evaluating performance under weaker community structure.
Abstract
In community detection, many methods require the user to specify the number of clusters in advance, since an exhaustive search over all possible values is computationally infeasible. While some classical algorithms can infer this number directly from the data, this is typically not the case for graph neural networks (GNNs): even when a desired number of clusters is specified, standard GNN-based methods often fail to return exactly that number because of how they are designed. In this work, we address this limitation by introducing a flexible and principled way to control the number of communities discovered by GNNs. Rather than assuming the true number of clusters is known, we propose a framework that allows the user to specify a plausible range and enforces these bounds during training. If the user instead wants an exact number of clusters, that number can also be specified and is reliably returned.
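
To make the failure mode concrete, here is a minimal sketch in plain Python. It is not the authors' method. It assumes the model outputs a soft assignment matrix with one row per node and k columns, and that hard labels are taken by argmax, which is one common setup. The function names and the toy numbers are hypothetical, chosen only for illustration. The sketch shows that requesting k = 3 clusters does not guarantee that three non-empty clusters come back, and how a returned partition could be checked against a user-specified range.

```python
def count_used_clusters(assignments):
    """Count clusters that receive at least one node under argmax assignment.

    `assignments` is a list of rows, one per node; each row holds k soft
    membership scores (assumed here to sum to 1, e.g. a softmax output).
    """
    used = set()
    for row in assignments:
        # Hard label = index of the largest soft membership score.
        used.add(max(range(len(row)), key=row.__getitem__))
    return len(used)


def within_bounds(assignments, k_min, k_max):
    """Check whether the number of non-empty clusters lies in [k_min, k_max]."""
    return k_min <= count_used_clusters(assignments) <= k_max


# Toy soft assignment for 4 nodes and k = 3 requested clusters (made-up values).
# The third column never wins the argmax, so that cluster stays empty.
S = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]

print(count_used_clusters(S))   # number of non-empty clusters
print(within_bounds(S, 3, 3))   # exact count of 3 requested
print(within_bounds(S, 2, 3))   # range [2, 3] requested
```

A check like this only diagnoses the problem after training. The framework described in the abstract instead enforces the bounds during training.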
