A classification of overlapping clustering schemes for hypergraphs
Vilhelm Agdur
TL;DR
This work develops a category-theoretic framework for overlapping clustering of hypergraphs, extending the representability program of Carlsson–Mémoli from partitions to overlapping clusterings. It introduces endofunctors $\Phi_{\mathfrak{R}}$ determined by a representing set $\mathfrak{R}$ and composes them with the $k$-line component functor $\Pi_k$, so that representable clustering schemes are realized as $\Pi_k \circ \Phi_{\mathfrak{R}}$. The main classification result is that a clustering scheme is representable if and only if it is excisive and functorial, up to a notion of refinement when $k>1$: every excisive and functorial scheme is refined by a representable one. The paper also gives results on when a finite representing set does or does not suffice. On the computational side, for a fixed representation and graphs of bounded expansion, it establishes running time linear in the size of the graph together with a polynomial bound governed by $|V|^{\alpha(\mathfrak{R})}$, where $\alpha(\mathfrak{R})$ is the maximum independence number of a graph in $\mathfrak{R}$. Overall, the framework offers a principled way to compare and construct clustering schemes for hypergraphs with theoretical guarantees.
Abstract
Community detection in graphs is a problem that is likely to be relevant whenever network data appears, and consequently the problem has received much attention with many different methods and algorithms applied. However, many of these methods are hard to study theoretically, and they optimise for somewhat different goals. A general and rigorous account of the problem and possible methods remains elusive. We study the problem of finding overlapping clusterings of hypergraphs, continuing the line of research started by Carlsson and Mémoli (2013) of classifying clustering schemes as functors. We extend their notion of representability to the overlapping case, showing that any representable overlapping clustering scheme is excisive and functorial, and any excisive and functorial clustering scheme is isomorphic to a representable clustering scheme. We also note that, for simple graphs, any representable clustering scheme is computable in polynomial time on graphs of bounded expansion, with an exponent determined by the maximum independence number of a graph in the representing set. This result also applies to non-overlapping representable clustering schemes, and so may be of independent interest.
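To make the composition $\Pi_k \circ \Phi_{\mathfrak{R}}$ concrete, the following is a minimal sketch for simple graphs under assumptions that are not taken from the text above: the representing set is the single triangle $K_3$, the morphisms are embeddings (so the images are exactly the triangles of the graph), and two images belong to the same $k$-line component when a chain of images links them with consecutive overlaps of at least $k$ vertices. The function names `triangles`, `k_line_components` and `cluster` are hypothetical. For $k=2$ this particular instance coincides with 3-clique percolation; it is an illustration of the idea, not the paper's algorithm.

```python
from itertools import combinations


def triangles(adj):
    """Vertex sets of all triangles in a simple graph {v: set_of_neighbours}.

    Assumed stand-in for Phi_R with R = {K_3}: each triangle is one hyperedge.
    """
    found = set()
    for u in adj:
        for v, w in combinations(adj[u], 2):
            if w in adj[v]:
                found.add(frozenset((u, v, w)))
    return found


def k_line_components(hyperedges, k):
    """Merge hyperedges linked by chains of overlaps of size >= k.

    Assumed stand-in for Pi_k: returns one cluster (a vertex set) per class.
    """
    edges = list(hyperedges)
    parent = list(range(len(edges)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(edges)), 2):
        if len(edges[i] & edges[j]) >= k:
            parent[find(i)] = find(j)

    clusters = {}
    for i, edge in enumerate(edges):
        clusters.setdefault(find(i), set()).update(edge)
    return {frozenset(c) for c in clusters.values()}


def cluster(adj, k):
    """Hypothetical composite: triangle images, then k-line components."""
    return k_line_components(triangles(adj), k)


# Two triangles {0, 1, 2} and {2, 3, 4} glued at vertex 2.
bowtie = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
```

On the `bowtie` graph, `cluster(bowtie, 1)` merges the two triangles into a single cluster, while `cluster(bowtie, 2)` keeps them as two clusters that overlap in vertex 2. The pairwise comparison of hyperedges is quadratic in the number of images and is chosen for brevity, not efficiency.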
