Local and Regional Counterfactual Rules: Summarized and Robust Recourses

Salim I. Amoukou, Nicolas J. B Brunel

TL;DR

This work addresses limitations of traditional Counterfactual Explanations by introducing Local Counterfactual Rules ($\text{L-CR}$) and Regional Counterfactual Rules ($\text{R-CR}$) that yield sparse, data-faithful recourses with high probability, derived from Random Forest partitions. Core constructs include $\text{CDP}$ (Counterfactual Decision Probability) and $\text{CRP}$ (Counterfactual Rule Probability), estimated via Projected Forests and Regional RF to quantify the likelihood of changing a decision within high-density regions, for both classification and regression. The authors present an end-to-end pipeline: identify Minimal Divergent Explanations, derive maximal counterfactual rectangles from RF partitions, and sample counterfactuals from these rules using energy-based methods; they demonstrate improved accuracy and plausibility over baselines and robust performance under noisy recourses. Empirical results on regression (California housing) and classification (COMPAS, Diabetes, NHANES) show that CRs provide more reliable, sparse, and plausible recourses while remaining model-agnostic, with an accompanying Python package for practical use.

Abstract

Counterfactual Explanations (CE) face several unresolved challenges, such as ensuring stability, synthesizing multiple CEs, and providing plausibility and sparsity guarantees. From a more practical point of view, recent studies [Pawelczyk et al., 2022] show that the prescribed counterfactual recourses are often not implemented exactly by individuals and demonstrate that most state-of-the-art CE algorithms are very likely to fail in this noisy environment. To address these issues, we propose a probabilistic framework that gives a sparse local counterfactual rule for each observation, providing rules that give a range of values capable of changing decisions with high probability. These rules serve as a summary of diverse counterfactual explanations and yield robust recourses. We further aggregate these local rules into a regional counterfactual rule, identifying shared recourses for subgroups of the data. Our local and regional rules are derived from the Random Forest algorithm, which offers statistical guarantees and fidelity to data distribution by selecting recourses in high-density regions. Moreover, our rules are sparse as we first select the smallest set of variables having a high probability of changing the decision. We have conducted experiments to validate the effectiveness of our counterfactual rules in comparison to standard CE and recent similar attempts. Our methods are available as a Python package.
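
To make the idea of "a range of values capable of changing decisions with high probability" concrete, the sketch below estimates how often a decision flips when the features named in a rule are moved into the rule's ranges. It is only a naive frequency estimate under stated assumptions, not the Random Forest based estimator used in the paper: the helper `rule_flip_frequency`, the toy model, and the data values are all hypothetical, and the feature names are borrowed from the fictitious dataset of Figure 1.

```python
def rule_flip_frequency(model, x, rule, data, target):
    """Fraction of candidate points inside `rule` that `model` maps to `target`.

    rule: dict {feature index: (low, high)} for the features allowed to change.
    Each candidate copies x, then takes the rule's features from a data row
    whose values fall inside the rule (a simple interventional replacement).
    """
    hits = total = 0
    for row in data:
        if all(low <= row[j] <= high for j, (low, high) in rule.items()):
            candidate = list(x)
            for j in rule:
                candidate[j] = row[j]
            total += 1
            hits += model(candidate) == target
    return hits / total if total else float("nan")


# Made-up model and data: features are (age, salary in k$, hours per week).
def toy_model(z):
    return int(z[1] >= 40 and z[2] >= 35)


data = [(25, 30, 20), (40, 45, 40), (35, 50, 38), (50, 42, 30), (30, 60, 45)]
x = (28, 25, 36)  # currently rejected because the salary is too low

print(rule_flip_frequency(toy_model, x, {1: (40, 60)}, data, target=1))  # 1.0
print(rule_flip_frequency(toy_model, x, {1: (30, 60)}, data, target=1))  # 0.8
```

The tighter rule on salary flips the decision for every candidate, while the wider one includes a value that fails. A rule with a high flip frequency tolerates imprecise implementation of the recourse better than a single counterfactual point.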

Paper Structure (13 sections, 2 equations, 4 figures, 2 tables, 1 algorithm)

Figures (4)

  • Figure 1: Illustration of local and regional Counterfactual Rules for a fictitious dataset with four variables: Age, Salary, Sex, and HoursPerWeek. Local rules change a single instance's decision, while regional rules apply to a sub-population. Blue indicates the suggested rules for changing decisions.
  • Figure 2: Illustration of the four stages of our methodology for computing sparse counterfactuals.
  • Figure 3: (a) Partition of the Random Forest, (b) partition of the Projected Random Forest when conditioning on $X_0$, i.e., ignoring the splits on $X_1$, (c) the optimal Counterfactual Rule for $\boldsymbol{x}$ when conditioning on $X_0 = x_0$ (the green region).
  • Figure 4: Representation of a simple decision tree (right) and its associated partition (left). The gray part of the partition corresponds to the region $[2, \; 3.5] \times [1, 2]$.
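
The correspondence between a decision tree and a partition into rectangles, as in Figure 4, can be sketched in a few lines. The tree below is hypothetical: its splits are chosen only so that one leaf matches the region $[2, \; 3.5] \times [1, 2]$ named in the caption, since the actual splits of the figure are not given here. The helper `leaf_rectangles` and the leaf labels are likewise illustrative names.

```python
import math

# Hypothetical tree: an internal node is (feature, threshold, left, right),
# a leaf is a label. The left branch means x[feature] <= threshold.
TREE = (0, 2.0,
        "A",
        (0, 3.5,
         (1, 1.0, "B", (1, 2.0, "gray", "C")),
         "D"))


def leaf_rectangles(node, bounds):
    """Yield (label, bounds) per leaf; bounds holds one (low, high) per feature."""
    if not isinstance(node, tuple):
        yield node, list(bounds)
        return
    feature, threshold, left, right = node
    low, high = bounds[feature]
    # Going left tightens the upper bound of the split feature.
    left_bounds = list(bounds)
    left_bounds[feature] = (low, min(high, threshold))
    yield from leaf_rectangles(left, left_bounds)
    # Going right tightens the lower bound of the split feature.
    right_bounds = list(bounds)
    right_bounds[feature] = (max(low, threshold), high)
    yield from leaf_rectangles(right, right_bounds)


unbounded = [(-math.inf, math.inf)] * 2
cells = dict(leaf_rectangles(TREE, unbounded))
print(cells["gray"])  # [(2.0, 3.5), (1.0, 2.0)]
```

Each leaf ends up with one interval per feature, so the leaves together tile the input space. Rules expressed as ranges of feature values can be read directly off such cells.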

Theorems & Definitions (3)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3