Informative Perturbation Selection for Uncertainty-Aware Post-hoc Explanations

Sumedha Chugh, Ranjitha Prasad, Nazreen Shah

Abstract

Trust and ethical concerns arising from the widespread deployment of opaque machine learning (ML) models motivate the need for reliable model explanations. Post-hoc model-agnostic explanation methods address this challenge by learning a surrogate model that approximates the behavior of the deployed black-box ML model in the locality of a sample of interest. In post-hoc scenarios, neither the underlying model parameters nor the training data are available; hence, this local neighborhood must be constructed by generating perturbed inputs around the sample of interest and obtaining the corresponding model predictions. We propose \emph{Expected Active Gain for Local Explanations} (\texttt{EAGLE}), a post-hoc model-agnostic explanation framework that formulates perturbation selection as an information-theoretic active learning problem. By adaptively sampling perturbations that maximize the expected information gain, \texttt{EAGLE} efficiently learns a linear explainable surrogate model while producing feature importance scores together with uncertainty/confidence estimates. Theoretically, we establish that the cumulative information gain scales as $\mathcal{O}(d \log t)$, where $d$ is the feature dimension and $t$ is the number of samples, and that the sample complexity grows linearly with $d$ and logarithmically with the confidence parameter $1/\delta$. Empirical results on tabular and image datasets corroborate our theoretical findings and demonstrate that \texttt{EAGLE} improves explanation reproducibility across runs, achieves higher neighborhood stability, and improves perturbation sample quality compared to state-of-the-art baselines such as Tilia, US-LIME, GLIME and BayesLIME.
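To make the idea of selecting perturbations by expected information gain concrete, the following is a minimal sketch for a generic Bayesian linear surrogate with a Gaussian prior and Gaussian noise. It is not the authors' EAGLE acquisition function, which is not reproduced here and may include additional terms (for example, locality weighting). The function name `greedy_information_gain` and the parameters `pool`, `budget`, `prior_precision` and `noise_var` are hypothetical names chosen for illustration. Under these assumptions, the one-step information gain of a candidate $\mathbf z$ is $\tfrac12 \log(1 + \mathbf z^\top \Sigma \mathbf z / \sigma^2)$, where $\Sigma$ is the current posterior covariance.

```python
import numpy as np

def greedy_information_gain(pool, budget, prior_precision=1.0, noise_var=1.0):
    """Greedily pick `budget` rows of `pool` by one-step information gain.

    Assumes a Bayesian linear surrogate with prior N(0, I / prior_precision)
    on the weights and Gaussian observation noise of variance `noise_var`.
    This is an illustrative sketch, not the EAGLE acquisition itself.
    """
    pool = np.asarray(pool, dtype=float)
    n, d = pool.shape
    if budget > n:
        raise ValueError("budget cannot exceed the number of candidates")

    cov = np.eye(d) / prior_precision        # prior covariance of the weights
    available = np.ones(n, dtype=bool)
    chosen, gains = [], []

    for _ in range(budget):
        # Predictive variance z^T cov z for every candidate perturbation.
        var = np.einsum("ij,jk,ik->i", pool, cov, pool)
        # Information gain about the weights from observing each candidate.
        gain = 0.5 * np.log1p(var / noise_var)
        gain[~available] = -np.inf           # never pick a candidate twice
        i = int(np.argmax(gain))
        z = pool[i]

        # Rank-one (Sherman-Morrison) update of the posterior covariance.
        cz = cov @ z
        cov = cov - np.outer(cz, cz) / (noise_var + z @ cz)

        chosen.append(i)
        gains.append(float(gain[i]))
        available[i] = False

    return chosen, gains, cov
```

In this Gaussian setting the gains sum to half the drop in the log-determinant of the covariance, and a standard determinant-trace argument bounds that sum by $\tfrac{d}{2}\log\!\big(1 + tB^2/(d\lambda\sigma^2)\big)$ when every candidate satisfies $\|\mathbf z\|_2 \le B$, with $\lambda$ the prior precision. This has the same $\mathcal{O}(d \log t)$ shape as the rate stated in the abstract, though the paper's own constants and assumptions may differ.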

Paper Structure (14 sections, 7 theorems, 77 equations, 10 figures, 3 tables, 1 algorithm)

Key Result

theorem 1

Consider a Bayesian linear surrogate model given in eq:bayesian_surrogate and the posterior covariance in eq:covarianceBLR. Let the EAGLE-based acquisition function be defined as in eq:EAGLEentropydef. Then, under single-step greedy acquisition, maximizing $\mathcal{A}_{\mathrm{E}}(\mathbf z)$ over

Figures (10)

  • Figure 1: Perturbation strategies on the make_moons dataset ($n = 100$). Existing methods either sample in the vicinity of the instance while disregarding regions of epistemic uncertainty (LIME, BayLIME), are constrained excessively close to the instance of interest (GLIME), or fail to adapt to the locality and focus only on predictive variance (Focus Sampling/BayesLIME). In contrast, EAGLE (ours) selects perturbations that respect both locality and regions of high epistemic uncertainty while covering both sides of the decision boundary, resulting in a compact and informative neighborhood.
  • Figure 2: Active sampling convergence behaviour: EAGLE consistently improves both $D$-efficiency and cumulative information gain, leading to higher AUCC across datasets.
  • Figure 3: Hyperparameter sensitivity of EAGLE, reporting D-efficiency (solid) and CCM (dashed). (a) Prior precision $\lambda/d$: the default $\lambda{=}d$ balances posterior concentration and explanation consistency. (b) Pool size $|\mathcal{P}|$: EAGLE maintains strong performance across all pool sizes, requiring no careful tuning. (c) Superpixel count: EAGLE adapts effectively to increasing dimensionality on both MNIST and ImageNet.
  • Figure 4: Convergence of A-efficiency as a function of perturbation budget across tabular datasets. EAGLE consistently achieves higher A-efficiency with fewer perturbations compared to BayesLIME and BayLIME, indicating more informative perturbation selection.
  • Figure 5: Empirical cumulative distribution functions (ECDF) of cumulative information gain (CIG) at $n{=}500$ across test instances. Curves further to the right indicate higher information gain. EAGLE consistently dominates the baselines across all datasets.
  • ...and 5 more figures

Theorems & Definitions (14)

  • theorem 1
  • lemma 1
  • theorem 2
  • corollary 1: Sample Complexity for $\ell_2$-Accuracy
  • proof
  • proof
  • proof
  • proof
  • lemma 2: Prior distribution of $\bm{\phi}$
  • proof
  • ...and 4 more