Leveraging Black-box Models to Assess Feature Importance in Unconditional Distribution

Jing Zhou, Chunlin Li

TL;DR

The paper addresses how to assess the influence of features on the unconditional distribution of an outcome when using pretrained black-box predictors. It defines the feature-importance curve β(τ) through a distributional, von Mises-type expansion and provides a post hoc plug-in estimator that leverages a pretrained predictor without retraining, augmented by density extrapolation for tail estimation. A sparsification mechanism based on stepwise backward pruning across a grid of quantiles yields a sparse, interpretable set of features that contribute to different parts of the distribution. Empirical results on synthetic and high-dimensional data show faithful, sparse β(τ) estimates and computational efficiency, with limitations arising primarily under heavy-tailed error distributions.

Abstract

Understanding how changes in explanatory features affect the unconditional distribution of the outcome is important in many applications. However, existing black-box predictive models are not readily suited for analyzing such questions. In this work, we develop an approximation method to compute the feature importance curves relevant to the unconditional distribution of outcomes, while leveraging the power of pre-trained black-box predictive models. The feature importance curves measure the changes across quantiles of the outcome distribution induced by an external change in the explanatory features. Through extensive numerical experiments and real data examples, we demonstrate that our approximation method produces sparse and faithful results and is computationally efficient.

Paper Structure

This paper contains 18 sections, 1 theorem, 27 equations, 2 figures, 4 tables, 3 algorithms.

Key Result

Lemma 1

Let $G = (1 - t)F + t F'$, where $F$ and $F'$ are two distribution functions and $t \in [0, 1]$. The functional $\nu(G)$ around $F$ can be approximated as

$$\nu(G) = \nu(F) + t \int \hbox{IF}(x; \nu)\, d(F' - F)(x) + o_\nu,$$

where the remainder term $o_\nu$ is of order $t$ and depends on the functional $\nu$, and $\hbox{IF}(x; \nu)$ refers to the influence function of $\nu$ at $F$.
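As a rough numerical illustration of the von Mises linear approximation, the sketch below takes $\nu$ to be the $\tau$-quantile, whose influence function at $F$ is the standard $(\tau - 1\{x \le q_\tau\}) / f(q_\tau)$, and compares the linear term $\nu(F) + t \int \hbox{IF}(x; \nu)\, d(F' - F)(x)$ with the exact quantile of the mixture $G$. The choices $F = N(0, 1)$, $F' = N(1, 1)$, $\tau = 0.5$, and all function names are assumptions made for this example and are not taken from the paper.

```python
from statistics import NormalDist

# Illustrative assumptions: F = N(0, 1), F' = N(1, 1), tau = 0.5.
F = NormalDist(0.0, 1.0)
F_PRIME = NormalDist(1.0, 1.0)
TAU = 0.5


def mixture_quantile(t, tau=TAU):
    """Exact tau-quantile of G = (1 - t) F + t F', found by bisection on the CDF of G."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (1 - t) * F.cdf(mid) + t * F_PRIME.cdf(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def linear_approx(t, tau=TAU):
    """First-order approximation nu(F) + t * integral of IF d(F' - F)."""
    q = F.inv_cdf(tau)
    # The quantile influence function integrates to zero under F and to
    # (tau - F'(q)) / f(q) under F', so only the F' term remains.
    return q + t * (tau - F_PRIME.cdf(q)) / F.pdf(q)


def approximation_error(t):
    """Absolute gap between the exact mixture quantile and the linear term."""
    return abs(mixture_quantile(t) - linear_approx(t))


if __name__ == "__main__":
    for t in (0.1, 0.05, 0.01):
        print(t, mixture_quantile(t), linear_approx(t), approximation_error(t))
```

In this setup the error divided by $t$ shrinks as $t$ decreases, which is the behaviour one would expect from a remainder that is negligible relative to the linear term.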

Figures (2)

  • Figure 1: Illustration of $\beta(\tau)$. Red points are the original distribution of $(X, Y)$. Blue points are the counterfactual distribution after a shift intervention on $X$. Grey line is $\mathbb{E}(Y\mid X)$. $\beta(0.2) > 0$ but $\beta(0.8)\approx 0$.
  • Figure 2: The percentage of out-of-range observations against quantile level $\tau$. The data are generated from $Y = 1 - 2 X_1 + 5 X_2 + \varepsilon$, where $\varepsilon \sim N(0,1)$, $X = (X_1, \ldots, X_4)^\top \sim N(\bm 0_4, \Sigma)$, and $\Sigma_{ij} = 0.5^{|i - j|}$.
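
The data-generating process in the Figure 2 caption can be simulated with a short sketch like the one below. It assumes the covariance is $\Sigma_{ij} = 0.5^{|i - j|}$ (a covariance matrix must be symmetric) and draws $X$ through an AR(1) recursion, which reproduces that covariance with unit variances. The function names, sample size, and seed are hypothetical, and the sketch does not compute the out-of-range percentages shown in the figure.

```python
import math
import random


def simulate(n, rho=0.5, seed=0):
    """Draw n pairs (x, y) with x in R^4 and Cov(x_i, x_j) = rho^|i - j|.

    Assumed model from the caption: y = 1 - 2 x_1 + 5 x_2 + eps, eps ~ N(0, 1).
    """
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho ** 2)  # keeps every coordinate at unit variance
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0)]
        for _j in range(3):
            # AR(1) step: correlation between neighbours is rho
            x.append(rho * x[-1] + scale * rng.gauss(0.0, 1.0))
        y = 1.0 - 2.0 * x[0] + 5.0 * x[1] + rng.gauss(0.0, 1.0)
        data.append((x, y))
    return data


def sample_cov(data, i, j):
    """Sample covariance between coordinates i and j of x (0-based indices)."""
    n = len(data)
    mean_i = sum(x[i] for x, _ in data) / n
    mean_j = sum(x[j] for x, _ in data) / n
    return sum((x[i] - mean_i) * (x[j] - mean_j) for x, _ in data) / (n - 1)


if __name__ == "__main__":
    sample = simulate(20000)
    print("mean of Y:", sum(y for _, y in sample) / len(sample))
    print("cov(X1, X2):", sample_cov(sample, 0, 1))
```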

Theorems & Definitions (2)

  • Definition 1: Influence function
  • Lemma 1: von Mises linear approximation