Counterfactual Explanations for Clustering Models

Aurora Spagnol, Kacper Sokol, Pietro Barbiero, Marc Langheinrich, Martin Gjoreski

TL;DR

This work proposes a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements that builds upon a state-of-the-art Bayesian counterfactual generator for supervised learning to deliver high-quality explanations.

Abstract

Clustering algorithms rely on complex optimisation processes that may be difficult to comprehend, especially for individuals who lack technical expertise. While many explainable artificial intelligence techniques exist for supervised machine learning, unsupervised learning, and clustering in particular, has been largely neglected. To complicate matters further, the notion of a "true" cluster is inherently challenging to define. These facets of unsupervised learning and its explainability make it difficult to foster trust in such methods and curtail their adoption. To address these challenges, we propose a new, model-agnostic technique for explaining clustering algorithms with counterfactual statements. Our approach relies on a novel soft-scoring method that captures the spatial information utilised by clustering models. It builds upon a state-of-the-art Bayesian counterfactual generator for supervised learning to deliver high-quality explanations. We evaluate its performance on five datasets and two clustering algorithms, and demonstrate that introducing soft scores to guide counterfactual search significantly improves the results.

Paper Structure (26 sections, 6 equations, 2 figures, 7 tables)

Figures (2)

  • Figure 1: Candidate counterfactuals for a $k$-means++ clustering model built on the Biodeg dataset. The colour-coding captures the quality of the counterfactuals, from blue (worse) to red (better). The x-axis represents the Gower distance; the y-axis shows the distance in the prediction space; and the z-axis is the proportion of unchanged features (the number of features that were not tweaked divided by the total number of features).
  • Figure 2: Representation of scores for the distance-based soft-scoring technique applied to a two-dimensional toy example. The target cluster is depicted in purple; large markers represent cluster centroids; and $x^\star$ is the initial instance. $S_x$ is the distance between $x^\star$ and the candidate $\mathit{CF}(x^\star)$; $S_f$ is the number of features changed (in this example $1$); and $S_y$ is the normalised distance to the centroid $C_{t}$ of the target cluster (see Equation \ref{eq:score}).
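
The three quantities named in the Figure 2 caption can be illustrated with a minimal Python sketch on a two-dimensional toy example. This is not the authors' implementation: the paper's scoring equation is not reproduced in this excerpt, so the normalisation used for $S_y$ below (distance to the target centroid divided by the sum of distances to all centroids) is an assumption made purely for illustration. The function names, the choice of the Gower distance for $S_x$ (borrowed from the Figure 1 axis), and the toy data are likewise hypothetical.

```python
import numpy as np


def gower_distance(x, cf, ranges):
    """Gower distance for purely numerical features:
    the mean of range-normalised absolute differences."""
    x, cf, ranges = (np.asarray(a, dtype=float) for a in (x, cf, ranges))
    return float(np.mean(np.abs(x - cf) / ranges))


def features_changed(x, cf, tol=1e-9):
    """Number of features whose value differs between x and the candidate."""
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(cf, dtype=float))
    return int(np.sum(diff > tol))


def normalised_centroid_distance(cf, centroids, target):
    """ASSUMED normalisation (not taken from the paper): distance from the
    candidate to the target centroid, divided by the sum of its distances
    to all centroids. Lower values mean the candidate sits closer to the
    target cluster relative to the others."""
    cf = np.asarray(cf, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    d = np.linalg.norm(centroids - cf, axis=1)
    return float(d[target] / d.sum())


# Toy data (hypothetical): two centroids, target cluster has index 1.
x_star = [0.0, 0.0]               # initial instance
candidate = [2.0, 0.0]            # candidate counterfactual, one feature changed
feature_ranges = [4.0, 4.0]       # observed range of each feature
centroids = [[0.0, 0.0], [4.0, 0.0]]

s_x = gower_distance(x_star, candidate, feature_ranges)
s_f = features_changed(x_star, candidate)
s_y = normalised_centroid_distance(candidate, centroids, target=1)

# Proportion of unchanged features, as plotted on the z-axis of Figure 1.
unchanged = 1.0 - s_f / len(x_star)
```

A soft score of this kind varies continuously as the candidate moves towards the target centroid, whereas a hard cluster label only changes once the candidate crosses a cluster boundary. That continuity is what allows a counterfactual search to be guided before the assignment actually flips.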