Graph Encoder Ensemble for Simultaneous Vertex Embedding and Community Detection
Cencheng Shen, Youngser Park, Carey E. Priebe
TL;DR
This work addresses three graph problems at once: vertex embedding, community detection, and estimating the number of communities when it is unknown. It introduces a graph encoder ensemble built on a normalized one-hot encoder ($GEE1$) and a rank-based cluster-size measure (MRI). The resulting linear-time algorithm combines $Z = \mathbf{A}\mathbf{W}$ embeddings, $L2$ normalization, MRI-guided model selection, and ensemble $k$-means clustering to determine $K$ and the vertex labels. On SBM and DC-SBM simulations, the normalization and ensemble components substantially improve clustering accuracy (ARI) and cluster-size recovery compared with normalization-free and spectral baselines. The approach offers a scalable, unified framework for graph analytics on large networks where the true number of communities is unknown.
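To make the embedding step concrete, here is a minimal sketch of $Z = \mathbf{A}\mathbf{W}$ followed by row-wise $L2$ normalization, for a fixed label vector. The function name `encoder_embedding`, the toy graph, and the choice to scale column $k$ of $\mathbf{W}$ by $1/n_k$ (the reciprocal of the community size) are assumptions made for illustration; the MRI measure and the ensemble $k$-means step are not shown because they are not specified here.

```python
import numpy as np

def encoder_embedding(A, labels, K):
    """Illustrative one-hot encoder embedding with row-wise L2 normalization.

    A      : (n, n) adjacency matrix
    labels : (n,) integer community labels in 0..K-1
    K      : number of communities
    """
    n = A.shape[0]
    W = np.zeros((n, K))
    W[np.arange(n), labels] = 1.0            # one-hot row per vertex
    counts = W.sum(axis=0)                   # community sizes n_k
    W = W / np.maximum(counts, 1.0)          # assumption: scale column k by 1/n_k
    Z = A @ W                                # Z = A W, shape (n, K)
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    return Z / np.where(norms > 0, norms, 1.0)  # leave all-zero rows untouched

# Toy graph (hypothetical): two disjoint triangles.
A = np.zeros((6, 6))
for block in ([0, 1, 2], [3, 4, 5]):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0

labels = np.array([0, 0, 0, 1, 1, 1])
Z = encoder_embedding(A, labels, K=2)
print(Z)
```

With the correct labels, every vertex in the first triangle maps to the unit vector along the first coordinate and every vertex in the second triangle to the unit vector along the second, so the normalized embedding separates the two communities exactly in this idealized case.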
Abstract
In this paper, we introduce a novel and computationally efficient method for vertex embedding, community detection, and community size determination. Our approach leverages a normalized one-hot graph encoder and a rank-based cluster size measure. Through extensive simulations, we demonstrate the excellent numerical performance of our proposed graph encoder ensemble algorithm.
