Counterfactual Explanations for k-means and Gaussian Clustering
Georgios Vardakas, Antonia Karra, Evaggelia Pitoura, Aristidis Likas
TL;DR
This work addresses the lack of local explainability for clustering by introducing counterfactual explanations for model-based clustering. A counterfactual is defined as a minimal change, measured in Euclidean distance, that moves an instance from its current cluster to a target cluster while respecting actionability and plausibility constraints. For $k$-means, the counterfactual has a closed-form solution obtained by projecting the factual onto a linear constraint, so no iterative optimization is needed; for Gaussian clusters with full, diagonal, or spherical covariances, the problem reduces to a nonlinear equation in a single parameter that can be solved with standard numerical methods. Empirical results on synthetic and real datasets show that, relative to the methods it is compared against, the proposed CfClust framework produces counterfactuals that lie closer to the factual and have a comparable or higher likelihood of belonging to the target cluster, while keeping computation times low, which makes it a candidate for large-scale, interactive explanation of clustering outcomes.
Abstract
Counterfactuals have been recognized as an effective approach to explain classifier decisions. Nevertheless, they have not yet been considered in the context of clustering. In this work, we propose the use of counterfactuals to explain clustering solutions. First, we present a general definition of counterfactuals for model-based clustering that includes plausibility and feasibility constraints. Then, we consider the counterfactual generation problem for k-means and Gaussian clustering assuming Euclidean distance. Our approach takes as input the factual, the target cluster, a binary mask indicating which features are actionable and which are immutable, and a plausibility factor specifying how far from the cluster boundary the counterfactual should be placed. In the k-means case, closed-form formulas are presented for computing the optimal solution, while in the Gaussian case (assuming full, diagonal, or spherical covariances) our method requires only the numerical solution of a nonlinear equation in a single parameter. We demonstrate the advantages of our approach through illustrative examples and quantitative experimental comparisons.
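To make the k-means case concrete, the following is a minimal sketch of a masked projection onto the boundary between two centroids. It is an illustration under stated assumptions, not the formulas of the paper: the function name `kmeans_counterfactual`, the parameter `eps`, and the way `eps` shifts the constraint are all hypothetical, and only the source and target centroids are considered (with more than two clusters, the result may still need to be checked against the remaining centroids). The boundary is the hyperplane $w^\top z = b$ with $w = c_t - c_s$ and $b = (\|c_t\|^2 - \|c_s\|^2)/2$, and the factual is moved only along its actionable coordinates.

```python
import numpy as np


def kmeans_counterfactual(x, c_source, c_target, mask=None, eps=0.0):
    """Closest point to x satisfying w.z >= b + eps, changing only masked features.

    Hypothetical helper: `eps` is an assumed margin (in units of w.z) that
    pushes the result past the boundary; it is not the paper's plausibility factor.
    """
    x = np.asarray(x, dtype=float)
    c_source = np.asarray(c_source, dtype=float)
    c_target = np.asarray(c_target, dtype=float)
    # mask[i] = 1 for actionable features, 0 for immutable ones
    mask = np.ones_like(x) if mask is None else np.asarray(mask, dtype=float)

    # Points equidistant from both centroids satisfy w.z = b
    w = c_target - c_source
    b = 0.5 * (c_target @ c_target - c_source @ c_source)

    # Only actionable coordinates may move, so the step direction is mask * w
    w_act = mask * w
    denom = w_act @ w_act
    if denom == 0.0:
        raise ValueError("no actionable feature can move x toward the target")

    gap = b + eps - w @ x
    if gap <= 0.0:
        # x already satisfies the constraint; no change is needed
        return x.copy()
    return x + (gap / denom) * w_act
```

For example, with `x = [0, 0]`, `c_source = [0, 0]`, and `c_target = [2, 0]`, the call `kmeans_counterfactual(x, c_source, c_target)` returns the midpoint `[1, 0]`, which lies exactly on the boundary; a positive `eps` moves the result further toward the target centroid.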
