Synthesising Counterfactual Explanations via Label-Conditional Gaussian Mixture Variational Autoencoders

Junqi Jiang, Francesco Leofante, Antonio Rago, Francesca Toni

TL;DR

The paper addresses generating counterfactual explanations that balance validity, proximity, plausibility, diversity, and robustness to perturbations. It introduces L-GMVAE, a label-conditioned Gaussian mixture VAE that assigns class-specific latent clusters, and LAPACE, which constructs CE paths by linearly interpolating from an input's latent code to fixed class centroids and decoding the path into the input space. This approach yields CEs that are robust to input changes, offers a spectrum of recourses from near-proximate to highly robust, and supports actionability constraints via lightweight gradient updates. Empirical results across four tabular datasets demonstrate competitive metrics, perfect robustness to input changes in the best variant, strong plausibility, and fast inference, highlighting practical utility for model-agnostic recourse generation.

Abstract

Counterfactual explanations (CEs) provide recourse recommendations for individuals affected by algorithmic decisions. A key challenge is generating CEs that are robust against various perturbation types (e.g. input and model perturbations) while simultaneously satisfying other desirable properties. These include plausibility, ensuring CEs reside on the data manifold, and diversity, providing multiple distinct recourse options for single inputs. Existing methods, however, mostly struggle to address these multifaceted requirements in a unified, model-agnostic manner. We address these limitations by proposing a novel generative framework. First, we introduce the Label-conditional Gaussian Mixture Variational Autoencoder (L-GMVAE), a model trained to learn a structured latent space where each class label is represented by a set of Gaussian components with diverse, prototypical centroids. Building on this, we present LAPACE (LAtent PAth Counterfactual Explanations), a model-agnostic algorithm that synthesises entire paths of CE points by interpolating from inputs' latent representations to those learned latent centroids. This approach inherently ensures robustness to input changes, as all paths for a given target class converge to the same fixed centroids. Furthermore, the generated paths provide a spectrum of recourse options, allowing users to navigate the trade-off between proximity and plausibility while also encouraging robustness against model changes. In addition, user-specified actionability constraints can be easily incorporated via lightweight gradient optimisation through the L-GMVAE's decoder. Comprehensive experiments show that LAPACE is computationally efficient and achieves competitive performance across eight quantitative metrics.
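
The path construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `interpolate`, `latent_paths` and `toy_decode` are hypothetical, the identity function stands in for a trained L-GMVAE decoder, and the latent codes and centroids are made-up numbers. It only shows the core idea that each path moves linearly in latent space from the input's code (τ = 0) to a fixed target-class centroid (τ = 1), so paths from different inputs end at the same points.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def interpolate(z_start: Sequence[float], z_end: Sequence[float], tau: float) -> Vector:
    """Linear interpolation in latent space: (1 - tau) * z_start + tau * z_end."""
    return [(1.0 - tau) * a + tau * b for a, b in zip(z_start, z_end)]


def latent_paths(
    z_input: Sequence[float],
    centroids: Sequence[Sequence[float]],
    decode: Callable[[Vector], Vector],
    n_steps: int = 5,
) -> List[List[Vector]]:
    """Return one decoded path per target-class centroid, with tau going from 0 to 1."""
    if n_steps < 2:
        raise ValueError("n_steps must be at least 2")
    taus = [i / (n_steps - 1) for i in range(n_steps)]
    return [[decode(interpolate(z_input, c, t)) for t in taus] for c in centroids]


def toy_decode(z: Vector) -> Vector:
    """Hypothetical stand-in for a trained decoder (assumption: identity map)."""
    return list(z)


# Made-up latent codes for two slightly different inputs, and two made-up
# centroids for the target class.
z_a = [0.0, 0.0]
z_b = [0.2, -0.1]
target_centroids = [[2.0, 2.0], [4.0, 0.0]]

paths_a = latent_paths(z_a, target_centroids, toy_decode)
paths_b = latent_paths(z_b, target_centroids, toy_decode)

# Both inputs reach the same end points, one per centroid.
print(paths_a[0][-1], paths_b[0][-1])
```

In this toy setting the end of every path depends only on the centroid and not on the input, which is the property the abstract relies on for robustness to input changes. Points earlier on a path stay closer to the input, which is where the proximity trade-off comes from.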

Paper Structure

This paper contains 13 sections, 6 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: Illustration of LAPACE in binary classification. Given a dataset with a trained classifier's predictions (Left), an L-GMVAE is first learned with latent clusters (Gaussian components), capturing the data distribution with the classifier's predictions. In this example, we have 6 Gaussian components (Middle, the coloured areas). Prediction label 0 (1) is associated with Clusters 0-2 (3-5). The cluster centroids (learned Gaussian mixture prior) for classes 0 and 1 are marked with crosses and check marks. Assuming we are computing CEs for a negatively classified point (Left, purple star), LAPACE first performs linear interpolations linking the input's latent representation to each class 1 cluster centroid (Middle, dashed lines). These paths are then decoded to the input space to obtain paths of points, which terminate at the decoded class 1 cluster centroids (Right).
  • Figure 2: Example CE paths found by LAPACE on the MNIST dataset, for an input image of class 5 and a target label of 7. This L-GMVAE has 3 Gaussian clusters per class. Each row is a separate CE path, going from the reconstruction of the original input (the second image from the left, $\tau=0$) to each reconstructed cluster centroid (the last image, when $\tau=1$).
  • Figure 3: Continued MNIST example for the second cluster (the middle row in Figure 2). A requirement that a dash should not appear in the resulting CE images of class 7 is enforced on every image. For an input image size of $28\times28$, pixel values greater than a threshold of 0.01 at the 13th to 18th rows and the 8th to 14th columns (where the dash appears) are penalised via a loss function.
  • Figure 4: Quantitative evaluation of CEs generated by: NNCE, FACE, RobXCE, DiCE, DRCE, LAPACE-First, LAPACE-Middle, LAPACE-Last. Each subplot compares all methods on one dataset-classifier combination and one evaluation metric. The arrow following each metric indicates whether higher or lower values are considered better. The first three methods find only one CE per input, so the diversity metric cannot be computed for them; they are therefore assigned a value of -1.
  • Figure 5: Reconstructed cluster centroids for the L-GMVAE model on the MNIST dataset used in the examples.
  • ...and 1 more figure
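
The actionability requirement in the Figure 3 caption (no dash in the class 7 images) can be sketched as a simple region penalty. The caption does not give the exact form of the loss, so the hinge-style sum below is an assumption, as are the function name `region_penalty` and the reading of "13th to 18th rows" and "8th to 14th columns" as 1-based, inclusive ranges. The sketch only evaluates the penalty; the gradient updates through the decoder mentioned in the abstract would need an automatic differentiation framework and are not shown.

```python
from typing import List, Tuple

Image = List[List[float]]


def region_penalty(
    image: Image,
    rows: Tuple[int, int] = (13, 18),   # assumed 1-based, inclusive
    cols: Tuple[int, int] = (8, 14),    # assumed 1-based, inclusive
    threshold: float = 0.01,
) -> float:
    """Sum of the amounts by which pixels in the region exceed the threshold.

    Assumed hinge form: sum over the region of max(0, pixel - threshold).
    """
    r_first, r_last = rows
    c_first, c_last = cols
    total = 0.0
    for r in range(r_first - 1, r_last):        # convert to 0-based indices
        for c in range(c_first - 1, c_last):
            total += max(0.0, image[r][c] - threshold)
    return total


# A blank 28x28 image incurs no penalty.
blank = [[0.0] * 28 for _ in range(28)]

# One bright pixel inside the region (row 15, column 11 in 1-based terms).
with_dash = [row[:] for row in blank]
with_dash[14][10] = 0.51

# One bright pixel outside the region is ignored.
outside = [row[:] for row in blank]
outside[0][0] = 1.0

print(region_penalty(blank), region_penalty(with_dash), region_penalty(outside))
```

A penalty of this kind is zero whenever the region is already below the threshold, so it only affects path points where the unwanted stroke actually appears.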