Calibrated Explanations: with Uncertainty Information and Counterfactuals
Helena Löfström, Tuwe Löfström, Ulf Johansson, Cecilia Sönströd
TL;DR
Calibrated Explanations (CE) addresses the unreliability of local feature importance explanations by calibrating the underlying model with Venn-Abers predictors and giving the feature weights an exact definition with quantified uncertainty. CE produces both factual and counterfactual explanations, each with calibrated probability estimates and uncertainty intervals, in a model-agnostic framework built on conditional rules. The authors evaluate CE on 25 binary classification datasets, report that it is fast, stable, and robust, and compare it with state-of-the-art methods. In that comparison, CE is distinguished by providing uncertainty information for both factual and counterfactual explanations. The work is released as an open-source package for generating calibrated explanations and visualizing their uncertainty; planned future work includes real-world user studies and extensions to other data modalities.
Abstract
While local explanations for AI models can offer insights into individual predictions, such as feature importance, they are plagued by issues like instability. The unreliability of feature weights, often skewed by poorly calibrated ML models, deepens these challenges. Moreover, the uncertainty of feature importance remains mostly unaddressed in Explainable AI (XAI). The feature importance explanation method presented in this paper, called Calibrated Explanations (CE), is designed to tackle these issues. Built on Venn-Abers, CE not only calibrates the underlying model but also delivers reliable feature importance explanations with an exact definition of the feature weights. CE goes beyond conventional solutions by addressing output uncertainty: it quantifies the uncertainty of both the feature weights and the model's probability estimates. Additionally, CE is model-agnostic, uses easily comprehensible conditional rules, and can generate counterfactual explanations with embedded uncertainty quantification. Results from an evaluation on 25 benchmark datasets indicate that CE is a fast, reliable, stable, and robust solution.
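To illustrate the kind of calibrated probability interval that Venn-Abers provides, the following is a minimal sketch of an inductive Venn-Abers predictor in plain Python. It is not the authors' implementation or the API of their package: the function names (`isotonic_fit`, `fitted_value`, `venn_abers`) and the toy calibration data are assumptions made for this example. The sketch fits isotonic regression twice, once with the test score labeled 0 and once labeled 1, and reads off the lower and upper probability bounds.

```python
from itertools import groupby


def isotonic_fit(scores, labels):
    """Non-decreasing least-squares fit of labels on scores (pool adjacent violators).

    Returns blocks of the form [lowest_score, highest_score, label_sum, count];
    the fitted value inside a block is label_sum / count.
    """
    pairs = sorted(zip(scores, labels))
    blocks = []
    # Tied scores are pooled first so that equal scores share one fitted value.
    for score, group in groupby(pairs, key=lambda pair: pair[0]):
        ys = [y for _, y in group]
        blocks.append([score, score, sum(ys), len(ys)])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][2] / blocks[-2][3] > blocks[-1][2] / blocks[-1][3]):
            _, high, total, count = blocks.pop()
            blocks[-1][1] = high
            blocks[-1][2] += total
            blocks[-1][3] += count
    return blocks


def fitted_value(blocks, score):
    """Fitted value for a score that was part of the fitted data."""
    for low, high, total, count in blocks:
        if low <= score <= high:
            return total / count
    raise ValueError("score was not part of the fitted data")


def venn_abers(cal_scores, cal_labels, test_score):
    """Return (p0, p1, p): lower bound, upper bound, and a merged point estimate."""
    scores = list(cal_scores) + [test_score]
    # Hypothesis 1: the test instance belongs to class 0.
    p0 = fitted_value(isotonic_fit(scores, list(cal_labels) + [0]), test_score)
    # Hypothesis 2: the test instance belongs to class 1.
    p1 = fitted_value(isotonic_fit(scores, list(cal_labels) + [1]), test_score)
    # A commonly used way to merge the two bounds into one probability.
    p = p1 / (1.0 - p0 + p1)
    return p0, p1, p


# Hypothetical calibration set: uncalibrated model scores and true labels.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
cal_labels = [0, 0, 1, 0, 1, 0, 1, 1]

p0, p1, p = venn_abers(cal_scores, cal_labels, 0.5)
print(f"interval = [{p0:.3f}, {p1:.3f}], point estimate = {p:.3f}")
```

On this toy data the interval for a score of 0.5 is [1/3, 2/3] with a point estimate of 0.5. The interval is wide because the calibration set is tiny; with more calibration data the two bounds typically move closer together.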
