Don't Explain Noise: Robust Counterfactuals for Randomized Ensembles

Alexandre Forel; Axel Parmentier; Thibaut Vidal

TL;DR

This work addresses the fragility of counterfactual explanations for classifiers that are implemented as randomized ensembles. It formulates robustness to algorithmic uncertainty as a probabilistic constraint and derives a simple deterministic threshold $\tau(N,\alpha)$, with $p_{N,\alpha}^* = g_N^{-1}(\alpha)$, that yields robust counterfactual explanations with the same computational cost as naive methods. The authors provide theoretical guarantees for ensembles of convex base learners and finite-sample bounds for convex approximations, along with practical sample-average approximations (Direct-SAA and Robust-SAA). Empirical results on real datasets show that naive counterfactuals lack robustness (validity often below $0.5$, sometimes near $0.2$), while the proposed methods deliver high robustness with modest increases in counterfactual distance, and robustness correlates with feature predictive importance. The framework offers a principled, scalable approach to provide reliable algorithmic recourse in the presence of ensemble randomness, with clear guidance on when robust counterfactual explanations are necessary.

Abstract

Counterfactual explanations describe how to modify a feature vector in order to flip the outcome of a trained classifier. Obtaining robust counterfactual explanations is essential to provide valid algorithmic recourse and meaningful explanations. We study the robustness of explanations of randomized ensembles, which are always subject to algorithmic uncertainty even when the training data is fixed. We formalize the generation of robust counterfactual explanations as a probabilistic problem and show the link between the robustness of ensemble models and the robustness of base learners. We develop a practical method with good empirical performance and support it with theoretical guarantees for ensembles of convex base learners. Our results show that existing methods give surprisingly low robustness: the validity of naive counterfactuals is below $50\%$ on most data sets and can fall to $20\%$ on problems with many features. In contrast, our method achieves high robustness with only a small increase in the distance from counterfactual explanations to their initial observations.

Paper Structure (29 sections, 8 theorems, 34 equations, 12 figures, 3 tables)

This paper contains 29 sections, 8 theorems, 34 equations, 12 figures, 3 tables.

Introduction
Problem Statement and Background
Classification Ensembles
Counterfactual Explanations of Ensembles
Algorithmic Uncertainty, Validity and Robustness
Robust Counterfactual Explanations
Reformulating the Robustness Constraint
Sample-Average Approximations
Robustness Guarantees for Ensembles of Convex Learners
Experimental Results
Achieving Robust Counterfactual Explanations
Feature Importance and Robustness
Conclusions
Supplementary Material: Proofs
Proof of Lemma \ref{lem:gN}
...and 14 more sections

Key Result

Lemma

Given $N \in \mathbb{N}$, the map $g_N: [0, 1] \to [0, 1], p \mapsto B\left(N/2 ; N, p\right)$ is decreasing and invertible.
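
Because $g_N$ is decreasing and invertible, the threshold $p_{N,\alpha}^* = g_N^{-1}(\alpha)$ mentioned in the TL;DR can be found numerically. The following is a minimal sketch, not the authors' implementation: it assumes $B(k; N, p)$ denotes the binomial cumulative distribution function and that $N/2$ is rounded down for odd $N$. The function names `g` and `g_inverse` and the bisection tolerance are illustrative choices.

```python
from math import comb


def g(n: int, p: float) -> float:
    """Binomial CDF B(floor(n/2); n, p): probability of at most n//2 successes.

    Assumption: N/2 is rounded down when n is odd.
    """
    k_max = n // 2
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_max + 1))


def g_inverse(n: int, alpha: float, tol: float = 1e-10) -> float:
    """Invert g_N by bisection on [0, 1], relying on g_N being decreasing."""
    lo, hi = 0.0, 1.0  # g(n, 0) = 1 and g(n, 1) = 0 for n >= 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(n, mid) > alpha:
            lo = mid  # g is still above alpha, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # Hypothetical setting: 100 base learners, robustness target 1 - alpha = 0.95.
    p_star = g_inverse(100, 0.05)
    print(f"p* = {p_star:.4f}, g_N(p*) = {g(100, p_star):.4f}")
```

Bisection is used here only because it needs nothing beyond monotonicity, which the lemma supplies; any one-dimensional root finder would serve equally well.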

Figures (12)

Figure 1: Sensitivity of the robustness threshold $p_{N, \alpha}^*$.
Figure 2: Initial observation and counterfactual explanations for increasing robustness target ($1-\alpha$).
Figure 3: Validity of robust counterfactuals as a function of the robustness target $(1-\alpha)$.
Figure 4: Trade-off between the distance and robustness of counterfactual explanations.
Figure 5: Average number of features changed for varying robustness targets $(1-\alpha)$.
...and 7 more figures

Theorems & Definitions (11)

Definition: Validity
Definition: Algorithmic robustness
Lemma
Proposition
Proposition
Lemma
Proposition
Proposition: Asymptotic consistency
Proposition: Finite-sample guarantees
Lemma
...and 1 more
