Evaluation Metrics for Unsupervised Learning Algorithms
Julio-Omar Palacio-Niño, Fernando Berzal
TL;DR
This technical report examines how to evaluate clustering quality in unsupervised learning. It foregrounds Kleinberg's impossibility theorem, which shows that no clustering function can simultaneously satisfy the three axioms of scale invariance, richness, and consistency. It provides a taxonomy of evaluation metrics that separates internal validation (cohesion/separation and dendrogram-based measures) from external validation (matching sets, pairwise correlation, and information-theoretic indices), and it discusses null-hypothesis testing as a tool for assessing clustering tendency. The report also covers practical hyperparameter tuning strategies (grid, random, and Bayesian search) for optimizing clustering under a chosen validation criterion, and it highlights the need to combine multiple metrics for a robust assessment. Together, these elements offer a structured framework for selecting and applying clustering evaluation methods across problem settings and algorithm families.
Abstract
Determining the quality of the results obtained by clustering techniques is a key issue in unsupervised machine learning. Many authors have discussed the desirable features of good clustering algorithms. However, Jon Kleinberg established an impossibility theorem for clustering. As a consequence, a wealth of studies have proposed techniques to evaluate the quality of clustering results depending on the characteristics of the clustering problem and the algorithmic technique employed to cluster data.
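As a concrete illustration of what evaluating a clustering result can look like, the following is a minimal sketch of the Rand index, one well-known pair-counting external-validation measure. It is offered as an example only and is not taken from the report itself; the function name `rand_index` and the toy label lists are assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Return the fraction of point pairs on which two partitions agree.

    A pair counts as an agreement when both partitions place the two
    points in the same cluster, or both place them in different clusters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two points")

    agreements = 0
    for i, j in pairs:
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true == same_pred:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical toy data: a reference partition and a candidate clustering.
reference = [0, 0, 0, 1, 1, 1]
candidate = [0, 0, 1, 1, 2, 2]
score = rand_index(reference, candidate)
print(f"Rand index: {score:.3f}")
```

The score lies in [0, 1] and depends only on how points are grouped, not on the label values, so relabeling the clusters leaves it unchanged. It requires a reference partition, which is what distinguishes external validation from internal validation.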
