Interpretable Clustering: A Survey

Lianyu Hu; Mudi Jiang; Junjie Dong; Xinying Liu; Zengyou He

TL;DR

This survey addresses the problem of opaque clustering by formalizing interpretability within clustering and organizing methods across the pre-, in-, and post-clustering stages. It introduces a four-criterion taxonomy (process stage, interpretable model, interpretability level, data modality) and links interpretable clustering to supervised XAI concepts, outlining intrinsic and post-hoc approaches. The paper reviews pre-clustering feature extraction and selection, in-clustering models (decision trees, rules, prototypes, convex polyhedra), and post-clustering surrogates, emphasizing optimization-based formulations and the trade-off between interpretability and clustering quality. It highlights open challenges, such as scalability and the evaluation of interpretability across diverse data types, and provides a repository of representative methods to facilitate adoption in high-stakes domains.

Abstract

In recent years, much of the research on clustering algorithms has primarily focused on enhancing their accuracy and efficiency, frequently at the expense of interpretability. However, as these methods are increasingly being applied in high-stakes domains such as healthcare, finance, and autonomous systems, the need for transparent and interpretable clustering outcomes has become a critical concern. This is not only necessary for gaining user trust but also for satisfying the growing ethical and regulatory demands in these fields. Ensuring that decisions derived from clustering algorithms can be clearly understood and justified is now a fundamental requirement. To address this need, this paper provides a comprehensive and structured review of the current state of explainable clustering algorithms, identifying key criteria to distinguish between various methods. These insights can effectively assist researchers in making informed decisions about the most suitable explainable clustering methods for specific application contexts, while also promoting the development and adoption of clustering algorithms that are both efficient and transparent. For convenient access and reference, an open repository organizes representative and emerging interpretable clustering methods under the taxonomy proposed in this survey, available at https://github.com/hulianyu/Awesome-Interpretable-Clustering

Paper Structure (19 sections, 2 figures, 3 tables)

This paper contains 19 sections, 2 figures, 3 tables.

Introduction
The need for interpretable clustering
What is interpretable clustering?
What is a good interpretable clustering method?
A taxonomy of interpretable clustering methods
Conceptual correspondence between interpretable clustering and supervised XAI
Interpretable pre-clustering methods
Interpretable in-clustering methods
Decision tree-based methods
Rule-based methods
Other methods
Summary
Interpretable post-clustering methods
Decision tree-based methods
Rule-based methods
...and 4 more sections

Figures (2)

Figure 1: Interpretable clustering taxonomy categorized by distinct criteria; most existing methods align with a single category per criterion.
Figure 2: Illustration of four interpretable clustering models applied to the same two-dimensional dataset with three Gaussian clusters. The upper panels display how each model partitions the feature space, while the bottom panels show the feature values used for interpretability.
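
To make the post-clustering, decision tree-based idea concrete, the following is a minimal sketch rather than any method from the survey. It assumes cluster labels have already been produced by some opaque algorithm and then searches for axis-aligned thresholds that reproduce those labels. The toy data (three well-separated groups in two dimensions, loosely echoing the setup in Figure 2), the labels, and the helper name `best_split` are all hypothetical.

```python
# Illustrative sketch only: explain given cluster labels with axis-aligned cuts.
# The points, labels, and function names are hypothetical, not from the survey.

def best_split(points, labels):
    """Return (feature, threshold, errors) for the single axis-aligned cut
    that best separates the labels, scoring each side by majority vote."""
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every cut are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[f] <= t]
            right = [l for p, l in zip(points, labels) if p[f] > t]
            # Points that disagree with the majority label on their side.
            errors = sum(len(side) - max(side.count(c) for c in set(side))
                         for side in (left, right))
            if best is None or errors < best[2]:
                best = (f, t, errors)
    return best

# Toy 2-D data with labels assumed to come from an earlier clustering step.
points = [(0, 0), (1, 0), (0, 1),   # cluster 0
          (8, 0), (9, 1), (8, 1),   # cluster 1
          (4, 8), (5, 9), (4, 9)]   # cluster 2
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Level 1: the first cut isolates one cluster and leaves two mixed together.
first = best_split(points, labels)
print(first)   # (0, 2.5, 3)

# Level 2: split the impure side again (points with feature value > threshold).
f, t, _ = first
rest = [(p, l) for p, l in zip(points, labels) if p[f] > t]
second = best_split([p for p, _ in rest], [l for _, l in rest])
print(second)  # (0, 6.5, 0)
```

Read together, the two cuts give a three-leaf rule set on the first feature alone: values up to 2.5 map to cluster 0, values between 2.5 and 6.5 to cluster 2, and larger values to cluster 1. On real data the surrogate would rarely reach zero errors, which is one simple way to see the trade-off between interpretability and clustering quality that the survey discusses.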
