Ranking Counterfactual Explanations
Suryani Lim, Henri Prade, Gilles Richard
TL;DR
The paper formalizes counterfactual explanations and links them to factual explanations, then introduces a ranking framework to identify an optimal counterfactual via counterfactual power $cf_S(b,a) = |S_b ∩ B(a,b)|$ and a hyperball $B(a,b)$. It defines three quality metrics—typicality, capacity, and universality—to assess counterfactual explanations and validates the approach on 12 real-world categorical datasets, finding that a unique optimal counterfactual often exists and outperforms random minimal ones. The work is model-agnostic, provides a principled duality between factual and counterfactual explanations, and offers practical guidance for presenting robust, generalizable counterfactuals while acknowledging immutability constraints and GDPR considerations. Future work includes human-centered validation and deeper probabilistic analysis of the proposed metrics.
Abstract
AI-driven outcomes can be challenging for end-users to understand. Explanations can address two key questions: "Why this outcome?" (factual) and "Why not another?" (counterfactual). While substantial efforts have been made to formalize factual explanations, a precise and comprehensive study of counterfactual explanations is still lacking. This paper proposes a formal definition of counterfactual explanations, proving some properties they satisfy, and examining the relationship with factual explanations. Given that multiple counterfactual explanations generally exist for a specific case, we also introduce a rigorous method to rank these counterfactual explanations, going beyond a simple minimality condition, and to identify the optimal ones. Our experiments with 12 real-world datasets highlight that, in most cases, a single optimal counterfactual explanation emerges. We also demonstrate, via three metrics, that the selected optimal explanation exhibits higher representativeness and can explain a broader range of elements than a random minimal counterfactual. This result highlights the effectiveness of our approach in identifying more robust and comprehensive counterfactual explanations.
