Nearest Neighbour Equilibrium Clustering
David P. Hofmeyr
TL;DR
The paper introduces Nearest Neighbour Equilibrium Clustering (NNEC), an unsupervised clustering method that defines a cluster through an equilibrium condition balancing the size and cohesiveness of neighbourhoods. Clusters are grown iteratively from seeds, and each point is finally assigned to the cluster for which its membership strength is greatest. The parameters $k$ and $\lambda$ are tuned automatically via a normalisation-based criterion. Evaluated on 45 public datasets against a suite of competing methods, NNEC achieves the best average performance across AMI, ARI, and accuracy, while remaining simple and scalable. An open-source implementation is provided, making the method practical for automated exploratory clustering.
Abstract
A novel and intuitive nearest-neighbours-based clustering algorithm is introduced, in which a cluster is defined by an equilibrium condition that balances its size and cohesiveness. The formulation of the equilibrium condition allows the strength of alignment of each point to a cluster to be quantified, and these cluster alignment strengths lead naturally to a model selection criterion which renders the proposed approach fully automatable. The algorithm is simple to implement and computationally efficient, and produces clustering solutions of extremely high quality in comparison with relevant benchmarks from the literature. R code to implement the approach is available from https://github.com/DavidHofmeyr/NNEC.
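The reported comparison uses external validation indices (AMI, ARI, and accuracy), which score a clustering against known class labels. As an illustration of what one of these measures, the following is a minimal Python sketch of the standard adjusted Rand index computed from pair counts. It is not taken from the NNEC code base (which is written in R); the function name `adjusted_rand_index` and the choice to return 1.0 in degenerate cases are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat labelings of the same points.

    Illustrative helper (hypothetical name), not part of the NNEC package.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have equal length")
    n = len(labels_true)

    # Contingency counts: points falling in each (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of point pairs grouped together under each view.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())
    total_pairs = comb(n, 2)

    if total_pairs == 0:
        # Fewer than two points: assumed convention, treat as perfect agreement.
        return 1.0

    # Agreement expected by chance, and the largest attainable agreement.
    expected = together_true * together_pred / total_pairs
    max_index = 0.5 * (together_true + together_pred)
    if max_index == expected:
        # Degenerate case: both labelings are one cluster, or all singletons.
        return 1.0
    return (together_both - expected) / (max_index - expected)
```

The index depends only on which points are grouped together, so relabelling the clusters (for example comparing `[0, 0, 1, 1]` with `[1, 1, 0, 0]`) still gives a perfect score, while a labeling that agrees with the reference no better than chance scores close to zero.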
