Local and Regional Counterfactual Rules: Summarized and Robust Recourses
Salim I. Amoukou, Nicolas J. B Brunel
TL;DR
This work addresses limitations of traditional Counterfactual Explanations by introducing Local Counterfactual Rules ($\text{L-CR}$) and Regional Counterfactual Rules ($\text{R-CR}$): sparse, data-faithful rules, derived from Random Forest partitions, that change the decision with high probability. Core constructs are the Counterfactual Decision Probability ($\text{CDP}$) and the Counterfactual Rule Probability ($\text{CRP}$), estimated via Projected Forests and Regional RF, which quantify the likelihood of changing a decision within high-density regions for both classification and regression. The authors present an end-to-end pipeline: identify Minimal Divergent Explanations, derive maximal counterfactual rectangles from RF partitions, and sample counterfactuals from these rules using energy-based methods. They report improved accuracy and plausibility over baselines, and robustness when recourses are implemented with noise. Experiments on regression (California housing) and classification (COMPAS, Diabetes, NHANES) indicate that the counterfactual rules provide more reliable, sparse, and plausible recourses while remaining model-agnostic. An accompanying Python package is available for practical use.
Abstract
Counterfactual Explanations (CE) face several unresolved challenges, such as ensuring stability, synthesizing multiple CEs, and providing plausibility and sparsity guarantees. From a more practical point of view, recent studies [Pawelczyk et al., 2022] show that individuals often do not implement the prescribed counterfactual recourses exactly, and demonstrate that most state-of-the-art CE algorithms are very likely to fail in this noisy environment. To address these issues, we propose a probabilistic framework that gives a sparse local counterfactual rule for each observation: a range of values capable of changing the decision with high probability. These rules serve as a summary of diverse counterfactual explanations and yield robust recourses. We further aggregate these local rules into a regional counterfactual rule, identifying shared recourses for subgroups of the data. Our local and regional rules are derived from the Random Forest algorithm, which offers statistical guarantees and fidelity to the data distribution by selecting recourses in high-density regions. Moreover, our rules are sparse, as we first select the smallest set of variables having a high probability of changing the decision. We have conducted experiments to validate the effectiveness of our counterfactual rules in comparison to standard CE and recent similar attempts. Our methods are available as a Python package.
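To make the idea of a rule that changes the decision "with high probability" concrete, the following is a minimal sketch, not the authors' estimator. It replaces the Random Forest based estimation described above with a crude empirical frequency, and it assumes the features are independent, so that redrawing the rule's features from data rows inside the rule is a reasonable stand-in for conditioning on the remaining features. The classifier `model`, the function `rule_change_probability`, and the toy data are all hypothetical names introduced for illustration.

```python
import random


def model(x):
    # Hypothetical black-box classifier: returns 1 when a weighted score exceeds 1.0.
    return 1 if 0.6 * x[0] + 0.4 * x[1] + 0.1 * x[2] > 1.0 else 0


def rule_change_probability(model, x, rule, data, target):
    """Empirical frequency with which a rule moves x to the target decision.

    rule maps a feature index to a (low, high) interval. Features named in
    the rule are redrawn from data rows lying inside every interval; all
    other features keep their values from x. This assumes independent
    features and is not the Random Forest estimator used in the paper.
    """
    inside = [
        row for row in data
        if all(low <= row[j] <= high for j, (low, high) in rule.items())
    ]
    if not inside:
        return 0.0  # no data supports the rule, so report no evidence of change
    hits = 0
    for row in inside:
        candidate = list(x)
        for j in rule:
            candidate[j] = row[j]
        hits += int(model(candidate) == target)
    return hits / len(inside)


# Toy data: three independent features, uniform on [0, 2] (an assumption).
rng = random.Random(0)
data = [[rng.uniform(0.0, 2.0) for _ in range(3)] for _ in range(2000)]

x = [0.5, 0.5, 0.5]           # currently classified as 0
tight_rule = {0: (1.5, 2.0)}  # sparse rule: only feature 0 changes
loose_rule = {0: (1.0, 2.0)}  # wider interval, so some values fail to flip the decision

p_tight = rule_change_probability(model, x, tight_rule, data, target=1)
p_loose = rule_change_probability(model, x, loose_rule, data, target=1)
print(p_tight, p_loose)
```

In this toy setting the tight rule flips the decision for every sampled value, so an individual who lands anywhere in the interval still obtains the desired outcome, whereas the loose rule does so only for part of its range. This is the sense in which an interval-valued rule tolerates inexact implementation of a recourse better than a single counterfactual point.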
