Calibrated Explanations for Regression

Tuwe Löfström; Helena Löfström; Ulf Johansson; Cecilia Sönströd; Rudy Matela

TL;DR

The paper addresses the need for uncertainty-aware explanations in regression by extending Calibrated Explanations from classification to standard and probabilistic regression. It leverages Conformal Predictive Systems (CPS) to provide calibrated predictions and uncertainty quantification, while using Venn-Abers (VA) style calibration for classification components where appropriate. The authors introduce factual and counterfactual explanations for both standard regression (with prediction intervals) and probabilistic regression (probabilities of exceeding a threshold), including conjunctive rules, and provide an open-source Python implementation. Through experiments on the California Housing dataset, the method demonstrates fast, reliable, and robust standard regression explanations, alongside a flexible probabilistic variant that trades some speed for threshold-based probabilistic insight. This work broadens the applicability of calibrated explanations to regression, enabling uncertainty-aware decision making in a wider range of domains.

Abstract

Artificial Intelligence (AI) is often an integral part of modern decision support systems. The best-performing predictive models used in AI-based decision support systems lack transparency. Explainable Artificial Intelligence (XAI) aims to create AI systems that can explain their rationale to human users. Local explanations in XAI can provide information about the causes of individual predictions in terms of feature importance. However, a critical drawback of existing local explanation methods is their inability to quantify the uncertainty associated with a feature's importance. This paper introduces an extension of a feature importance explanation method, Calibrated Explanations, which previously supported only classification, adding support for standard regression and probabilistic regression, i.e., the probability that the target is above an arbitrary threshold. The extension for regression keeps all the benefits of Calibrated Explanations, such as calibration of the prediction from the underlying model with confidence intervals, uncertainty quantification of feature importance, and support for both factual and counterfactual explanations. Calibrated Explanations for standard regression provides fast, reliable, stable, and robust explanations. Calibrated Explanations for probabilistic regression provides an entirely new way of creating probabilistic explanations from any ordinary regression model, allowing dynamic selection of thresholds. The method is model-agnostic and produces easily understood conditional rules. An implementation in Python is freely available on GitHub and can be installed with both pip and conda, making the results in this paper easily replicable.

Paper Structure (30 sections, 9 equations, 13 figures, 3 tables)

Introduction
Background
Post-Hoc Explanation Methods
Essential Characteristics of Explanations
Explanations for classification and regression
Venn-Abers predictors
Calibrated Explanations for Classification
Factual Calibrated Explanations for Classification
Counterfactual Calibrated Explanations for Classification
Conjunctive Calibrated Explanations
Calibrated Explanations for Regression
Conformal Predictive Systems
Factual and Counterfactual Explanations for Regression
Factual and Counterfactual Probabilistic Calibrated Explanations for Regression
Properties of Calibrated Explanations for Regression
...and 15 more sections

Figures (13)

Figure 1: A conformal predictive distribution (CPD) with three different intervals, each representing $90\%$ confidence: lower-bounded interval (above the $10^{th}$ percentile); two-sided interval (between the $5^{th}$ and the $95^{th}$ percentiles); upper-bounded interval (below the $90^{th}$ percentile). The black dotted lines indicate how to determine the probability of the true target being smaller than 0.5, which in this case is approximately $80\%$.
Figure 2: Code example on using calibrated-explanations for regression.
Figure 3: Code example on using calibrated-explanations with normalization.
Figure 4: A regular plot for the California Housing data set. The top bar illustrates the median (the red line) and a confidence interval (the light red area), defined by the $5^{th}$ and the $95^{th}$ percentiles. The subplot below visualizes the weights associated with each feature. Each weight indicates how much the corresponding feature rule contributes to the prediction. Negative weights in red indicate a negative impact on the prediction, whereas positive weights in blue indicate a positive impact.
Figure 5: The top bars of one-sided plots with confidence intervals bounded by the $90^{th}$ upper percentile (Fig. \ref{fig:simple_one_1}) and the $10^{th}$ lower percentile (Fig. \ref{fig:simple_one_2}). The red solid line represents the median. The weights (and consequently the entire subplot visualizing weights) are the same for these one-sided explanations as in Fig. \ref{fig:housing_simple}.
...and 8 more figures
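
The interval and threshold readings described in the Figure 1 caption can be illustrated with a small sketch. This is a deliberately simplified, residual-based empirical distribution and not the paper's implementation: the helper names (`cpd_values`, `percentile`, `prob_below`), the toy calibration residuals, and the threshold of 105 are all assumptions made for illustration, and a full Conformal Predictive System includes refinements that are omitted here.

```python
from bisect import bisect_right


def cpd_values(y_hat, cal_residuals):
    """Sorted candidate targets: the point prediction shifted by each calibration residual."""
    return sorted(y_hat + r for r in cal_residuals)


def percentile(cpd, p):
    """Nearest-rank value at the integer percentile p (1-100) of the sorted distribution."""
    n = len(cpd)
    rank = max(1, -(-p * n // 100))  # ceil(p * n / 100) using integer arithmetic
    return cpd[rank - 1]


def prob_below(cpd, threshold):
    """Empirical probability that the target is at or below the threshold."""
    return bisect_right(cpd, threshold) / len(cpd)


# Hypothetical toy data: 20 calibration residuals (y - y_hat) and one test prediction.
cal_residuals = list(range(-10, 10))
y_hat = 100
cpd = cpd_values(y_hat, cal_residuals)

# Three different intervals, each covering 90% of the empirical distribution.
two_sided = (percentile(cpd, 5), percentile(cpd, 95))
lower_bounded = (percentile(cpd, 10), float("inf"))
upper_bounded = (float("-inf"), percentile(cpd, 90))

# Probabilistic reading: how likely is the target to be at or below a chosen threshold?
p_below = prob_below(cpd, 105)

print("two-sided:", two_sided)
print("lower-bounded:", lower_bounded)
print("upper-bounded:", upper_bounded)
print("P(y <= 105):", p_below)
```

The same sorted distribution serves both use cases: percentiles give the one-sided and two-sided intervals used in standard regression, while evaluating it at a user-chosen threshold gives the probability used in probabilistic regression.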
