Refining Counterfactual Explanations With Joint-Distribution-Informed Shapley Towards Actionable Minimality

Lei You, Yijun Bian, Lele Cao

TL;DR

This work addresses the problem of generating actionable, minimal counterfactual explanations by reducing unnecessary feature changes without sacrificing validity. It introduces COLA, a versatile framework that couples counterfactual generation with joint-distribution-informed Shapley attributions, using an optimal-transport plan to align factual and counterfactual data. Theoretical results bound the counterfactual effect by the transport cost under a Lipschitz model, and empirical results show substantial reductions in required actions while preserving the counterfactual effect across diverse datasets and models. The findings indicate that a carefully learned joint distribution, rather than an exact instance-level alignment, yields more effective and actionable explanations.

Abstract

Counterfactual explanations (CE) identify data points that closely resemble the observed data but produce different machine learning (ML) model outputs, offering critical insights into model decisions. Despite the diverse scenarios, goals and tasks to which they are tailored, existing CE methods often lack actionable efficiency because the explanations presented to users and stakeholders include unnecessary feature changes. We address this problem by proposing a method that minimizes the required feature changes while maintaining the validity of CE, without imposing restrictions on models or CE algorithms, whether instance- or group-based. The key innovation lies in computing a joint distribution between observed and counterfactual data and leveraging it to inform Shapley values for feature attributions (FA). We demonstrate that optimal transport (OT) effectively derives this distribution, especially when the CE method used does not make the alignment between observed and counterfactual data clear. Additionally, a counterintuitive finding is uncovered: it may be misleading to rely on an exact alignment defined by the CE generation mechanism when conducting FA. Our proposed method is validated in extensive experiments across multiple datasets, showcasing its effectiveness in refining CE towards greater actionable efficiency.

Paper Structure

This paper contains 27 sections, 7 theorems, 51 equations, 8 figures, 3 tables, and 1 algorithm.

Key Result

Theorem 4.1

Consider the $1$-Wasserstein divergence $W_1$, i.e. $W_1(f(\mathbf{x}),\mathbf{y}^*)=\min_{\bm{\pi}\in\Pi}\sum_{i=1}^n \sum_{j=1}^m \pi_{ij} \left| f(\mathbf{x}_i) - \mathbf{y}^*_j \right|$. Suppose the counterfactual outcome $\mathbf{y}^*$ is fully achieved by $\mathbf{r}$, i.e. $\mathbf{y}^*_j = f$ [...] Namely, $\mathbf{p}_{\text{OT}}$ minimizes the upper bound of $W_1(f(\mathbf{x}),\mathbf{y}^*)$, wh[...]
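The $W_1$ expression in the theorem is a small linear program over transport plans, and it can be evaluated directly for scalar model outputs. The sketch below is illustrative only and is not the paper's implementation: the function name `w1_transport_plan` is hypothetical, and it assumes that $\Pi$ is the set of plans with uniform marginals ($1/n$ per row, $1/m$ per column), which the excerpt above does not state.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.stats import wasserstein_distance


def w1_transport_plan(fx, y_star):
    """Solve min_pi sum_ij pi_ij * |fx_i - y_j| (hypothetical helper).

    Assumption: uniform marginals, i.e. each row of pi sums to 1/n
    and each column sums to 1/m.
    """
    fx = np.asarray(fx, dtype=float)
    y_star = np.asarray(y_star, dtype=float)
    n, m = len(fx), len(y_star)

    # Cost matrix with entries |f(x_i) - y*_j|.
    cost = np.abs(fx[:, None] - y_star[None, :])

    # pi is flattened row-major, so pi_ij sits at index i * m + j.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0  # row-sum constraint for row i
    for j in range(m):
        a_eq[n + j, j::m] = 1.0  # column-sum constraint for column j
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(n, m)


# Toy model outputs and a target of "everyone registered" (made-up numbers).
value, plan = w1_transport_plan([0.1, 0.2, 0.4], [1.0, 1.0, 1.0])
print("W1 from the linear program:", value)
print("W1 from SciPy's 1-D routine:",
      wasserstein_distance([0.1, 0.2, 0.4], [1.0, 1.0, 1.0]))
```

For one-dimensional outputs with uniform weights, the linear program and `scipy.stats.wasserstein_distance` should agree up to solver tolerance; the linear program additionally returns the plan $\bm{\pi}$ itself.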

Figures (8)

  • Figure 1: [Example: User engagement on an e-commerce platform] A platform aims to increase user registrations. The platform has collected data on user interactions, such as the amount of money spent, the number of clicks, and whether the user has registered. In the original data ($\mathbf{x}$), no users are registered. Action plans $\mathbf{z}'$ and $\mathbf{z}''$ adjust user characteristics to achieve the desired outcome ($\mathbf{y}^*$) of full registration. Both plans achieve a half counterfactual effect, but $\mathbf{z}''$ requires fewer modifications compared to $\mathbf{z}'$. This benefits customers by preserving their natural interaction patterns, leading to a better user experience. For business operators, fewer modifications result in more efficient resource allocation and cost-effective strategies, making the improvements easier to implement and more sustainable.
  • Figure 2: [An illustration of COLA] This figure shows how COLA gets $\mathbf{c}$ and $\mathbf{z}$ for equation \ref{eq:main_problem}. We use $A^{\text{max}}_{\text{Value}}$ for illustration in line \ref{alg:cola-A_value} due to its simplicity. In lines 6--16, we assume $C=2$, and the sampling yields exactly two positions for modifications according to the probability matrix $\bm{\varphi}$.
  • Figure 3: $D(f(\mathbf{z}),\mathbf{y}^*)$ vs. allowed actions $C$. Each experiment uses 100 runs. The shaded regions show the 99.9% confidence intervals. $A^{\text{avg}}_{\text{Value}}$ is used for HELOC and COMPAS, and $A^{\text{max}}_{\text{Value}}$ is used for German Credit and Hotel Bookings. The legend inside the "RnDForest, $A_{\text{CE}}=$KNN" panel applies to all plots.
  • Figure 4: [German Credit] $D(f(z),y^*)$ vs. allowed actions $C$, with $D$ being .
  • Figure 5: [HELOC] $D(f(\mathbf{z}),\mathbf{y}^*)$ vs. allowed actions $C$. Each experiment uses 100 runs. The shaded regions show the 99.9% confidence intervals. The legends apply to all plots. $A^{\text{avg}}_{\text{Value}}$ is used.
  • ...and 3 more figures
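
The Figure 2 caption describes sampling a limited number of positions to modify according to a probability matrix $\bm{\varphi}$, under a budget $C$. A minimal sketch of that idea follows. It is not the paper's algorithm: the function `refine_with_budget`, the reference matrix `q` of replacement values, and the choice to sample without replacement from normalized $|\bm{\varphi}|$ are all assumptions made for illustration.

```python
import numpy as np


def refine_with_budget(x, q, phi, budget, rng):
    """Modify at most `budget` entries of x (hypothetical helper).

    x      : factual data, shape (n, d)
    q      : assumed reference values to copy from, same shape as x
    phi    : attribution scores, same shape as x; |phi| is normalized
             into a probability matrix (an assumption of this sketch)
    budget : maximum number of modified positions (C in the caption)
    rng    : numpy.random.Generator
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    weights = np.abs(np.asarray(phi, dtype=float)).ravel()
    if weights.sum() <= 0:
        raise ValueError("phi must contain at least one non-zero entry")
    probs = weights / weights.sum()

    # Distinct positions can only be drawn from non-zero probabilities.
    k = min(budget, int(np.count_nonzero(probs)))
    flat_idx = rng.choice(probs.size, size=k, replace=False, p=probs)

    # Binary indicator of modified positions (the role of c in the caption).
    mask = np.zeros(x.size, dtype=bool)
    mask[flat_idx] = True
    mask = mask.reshape(x.shape)

    # Refined data: copy q at the selected positions, keep x elsewhere.
    z = np.where(mask, q, x)
    return z, mask


rng = np.random.default_rng(0)
x = np.zeros((3, 2))                                   # toy factual data
q = np.ones((3, 2))                                    # toy reference values
phi = np.array([[0.5, 0.1], [0.2, 0.0], [0.1, 0.1]])   # toy attributions
z, mask = refine_with_budget(x, q, phi, budget=2, rng=rng)
print("modified positions:", int(mask.sum()))
```

Positions with larger attribution are more likely to be selected, so a small budget tends to be spent on the features that matter most to the model output.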

Theorems & Definitions (11)

  • Theorem 4.1: Towards Counterfactual Effect
  • Theorem 4.2: Interventional Effect of
  • Theorem 5.1: Counterfactual Proximity
  • Theorem B.1
  • proof
  • Theorem C.1: Theorem \ref{thm:lipschitz} in the main text
  • proof
  • Theorem D.1: Theorem \ref{thm:intervention} in the main text
  • proof
  • Theorem E.1: Theorem \ref{thm:z_distance} in the main text
  • ...and 1 more