Boundary-Aware Uncertainty for Feature Attribution Explainers

Davin Hill; Aria Masoomi; Max Torop; Sandesh Ghimire; Jennifer Dy

TL;DR

This work proposes the Gaussian Process Explanation UnCertainty (GPEC) framework, which produces a unified uncertainty estimate combining decision boundary-aware uncertainty with explanation function approximation uncertainty. It also introduces a novel geodesic-based kernel that captures the complexity of the target black-box decision boundary.

Abstract

Post-hoc explanation methods have become a critical tool for understanding black-box classifiers in high-stakes applications. However, high-performing classifiers are often highly nonlinear and can exhibit complex behavior around the decision boundary, leading to brittle or misleading local explanations. There is therefore a pressing need to quantify the uncertainty of such explanation methods in order to understand when explanations are trustworthy. In this work, we propose the Gaussian Process Explanation UnCertainty (GPEC) framework, which generates a unified uncertainty estimate combining decision boundary-aware uncertainty with explanation function approximation uncertainty. We introduce a novel geodesic-based kernel that captures the complexity of the target black-box decision boundary. We show theoretically that the proposed kernel similarity increases with decision boundary complexity. The proposed framework is highly flexible; it can be used with any black-box classifier and feature attribution method. Empirical results on multiple tabular and image datasets show that the GPEC uncertainty estimate improves understanding of explanations as compared to existing methods.

Paper Structure (43 sections, 2 theorems, 48 equations, 14 figures, 6 tables, 2 algorithms)

INTRODUCTION
RELATED WORKS
UNCERTAINTY FRAMEWORK FOR EXPLAINERS
WEG KERNEL
Geometry of the Decision Boundary
Weighting Decision Boundary Samples
WEG Kernel Approximation
GPEC Algorithm
EXPERIMENTS
Experiment Setup
Uncertainty Visualization
Regularization Test
GPEC Ablation Test
Time Complexity
Additional Results
...and 28 more sections

Key Result

Theorem 1

Given two points $x, x' \in \mathcal{X} \cap \mathcal{M}_F$, we have $\lim_{\rho \rightarrow \infty} k_{\textrm{WEG}}(x,x') = k_{\textrm{EG}}(x, x')$.
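
The limit in Theorem 1 can be checked numerically on a toy case. The sketch below is not the paper's implementation: it assumes an exponential kernel on geodesic distance for $k_{\textrm{EG}}$, a weighting of the form $q(x, m) \propto \exp(-\rho \lVert x - m \rVert^2)$ normalized over boundary samples, and a straight-line decision boundary (where geodesic and Euclidean distances coincide). The names `eg_kernel`, `weg_kernel`, `lam`, and `db_samples` are hypothetical.

```python
import numpy as np

def eg_kernel(m, m_prime, lam=1.0):
    # Assumed form: exponential kernel on geodesic distance. On a straight-line
    # boundary the geodesic distance equals the Euclidean distance.
    return float(np.exp(-lam * np.linalg.norm(m - m_prime)))

def weg_kernel(x, x_prime, db_samples, rho, lam=1.0):
    # Assumed form: average of the base kernel over boundary samples, weighted
    # by normalized exp(-rho * squared Euclidean distance) to x and to x_prime.
    def weights(p):
        logits = -rho * np.sum((db_samples - p) ** 2, axis=1)
        logits -= logits.max()  # numerical stabilisation before exponentiating
        w = np.exp(logits)
        return w / w.sum()

    q_x, q_xp = weights(x), weights(x_prime)
    diffs = db_samples[:, None, :] - db_samples[None, :, :]
    base = np.exp(-lam * np.linalg.norm(diffs, axis=-1))  # pairwise base kernel
    return float(q_x @ base @ q_xp)

# Toy linear boundary x2 = 0, sampled along x1 in [-5, 5].
db_samples = np.stack([np.linspace(-5.0, 5.0, 201), np.zeros(201)], axis=1)
x, x_prime = db_samples[100], db_samples[120]  # two points on the boundary

target = eg_kernel(x, x_prime)
rhos = (1.0, 10.0, 100.0, 10000.0)
gaps = [abs(weg_kernel(x, x_prime, db_samples, rho) - target) for rho in rhos]

for rho, gap in zip(rhos, gaps):
    print(f"rho={rho:>8}: |k_WEG - k_EG| = {gap:.2e}")
```

Under these assumptions, the weights concentrate on the boundary sample nearest each point as $\rho$ grows, so the weighted average collapses to the base kernel evaluated at the two points themselves.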

Figures (14)

Figure 1: Illustrative example of potential pitfalls when relying on local explainers for samples near complex regions of the decision boundary (left) as compared with a smoothed decision boundary (right).
Figure 2: Overview of the GPEC framework. GPEC takes samples from the classifier's decision boundary plus (possibly noisy) explanations and fits a GP model with the novel WEG Kernel. The GPEC estimate incorporates both the uncertainty derived from the decision boundary complexity and also the explanation approximation uncertainty from the explainer.
Figure 3: Consider a classifier with DB defined as $\mathcal{M}_{0} = \{\left(x_1, f(x_1)\right): x_1 \in \mathbb{R}_{>0}\}$ where $f(x_1) = 2\cos(\frac{10}{x_1})$. (A) Illustration of geodesic distance $d_{geo}(m, m')$ between two points $m, m' \in \mathcal{M}_{0}$. (B) Evaluation of the WEG kernel for $\mathcal{M}_{0}$ (top) and a linear DB (bottom). The gray region highlights the set $\{x' : k(x,x') \geq 0.9\}$ for a given $x$ (red). This region increases as the local DB becomes more linear. (C) During WEG approximation, we calculate Euclidean distances between $x,x'$ (red, green) and DB samples $m_1,...,m_J \in \mathcal{M}_{0}$ (blue). When appropriately normalized (Eq. \ref{eq:q_weighting}), this acts as a weighting for each element of the EG kernel.
Figure 4: Visualization of estimated explanation uncertainty for different datasets and competing methods. The heatmap represents uncertainty level for a grid of explanations for the x-axis feature; darker heatmap regions represent higher uncertainty. The black line represents the black-box DB, and red points represent training samples. The heatmap shows that GPEC uncertainty is elevated for samples near complex decision boundaries. In contrast, heatmaps for BayesSHAP, BayesLIME, and CXPlain are relatively uniform.
Figure 5: Average uncertainty values for different regions in Synthetic, binned by $x_1$. Synthetic is designed to have higher DB complexity for $x_1 \in [-4,4]$, which is reflected by high GPEC uncertainty in bins $(-4,-2]$, $(-2,0]$, $(0,2]$, $(2,4]$. Other methods do not capture the high DB complexity for $x_1 \in [-4,4]$.
...and 9 more figures
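
The geodesic distance in Figure 3(A) can be illustrated on the same curve, $f(x_1) = 2\cos(10/x_1)$. The sketch below approximates arc length along the boundary with a dense polyline and compares it with the straight-line distance between the endpoints. It is a rough numerical illustration only; the helper names, the two intervals, and the polyline resolution `n` are assumptions chosen for the example, not values from the paper.

```python
import numpy as np

def f(x1):
    # Decision boundary from Figure 3: x2 = 2 * cos(10 / x1), for x1 > 0.
    return 2.0 * np.cos(10.0 / x1)

def geodesic_distance(a, b, n=200_001):
    # Approximate arc length along (x1, f(x1)) between x1 = a and x1 = b
    # by summing the segment lengths of a dense polyline.
    xs = np.linspace(a, b, n)
    pts = np.stack([xs, f(xs)], axis=1)
    return float(np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1)))

def euclidean_distance(a, b):
    # Straight-line distance between the two boundary points.
    return float(np.hypot(b - a, f(b) - f(a)))

wiggly = (0.5, 1.5)   # region where the boundary oscillates rapidly
flat = (9.0, 10.0)    # region where the boundary is close to linear

ratio_wiggly = geodesic_distance(*wiggly) / euclidean_distance(*wiggly)
ratio_flat = geodesic_distance(*flat) / euclidean_distance(*flat)

print(f"geodesic / Euclidean on {wiggly}: {ratio_wiggly:.2f}")
print(f"geodesic / Euclidean on {flat}: {ratio_flat:.4f}")
```

Where the boundary oscillates, the geodesic distance is many times the Euclidean distance, so a kernel that decays with geodesic distance treats nearby points as less similar. Where the boundary is nearly linear the two distances almost coincide, which matches the larger high-similarity region described for the linear DB in Figure 3(B).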

Theorems & Definitions (3)

Theorem 1
Definition 1: Manifold Perturbation
Theorem 2
