
Disentangled Feature Importance

Jin-Hong Du, Kathryn Roeder, Larry Wasserman

Abstract

Feature importance (FI) measures are widely used to assess the contributions of predictors to an outcome, but they may target different notions of relevance. When predictors are correlated, traditional statistical FI methods are often tailored for feature selection and correlation can therefore be treated as conditional redundancy. By contrast, for model interpretation, FI is more naturally defined through marginal predictive relevance. In this context, we show that most existing approaches target identical population functionals under squared-error loss and exhibit correlation-induced bias. To address this limitation, we introduce Disentangled Feature Importance (DFI), a nonparametric generalization of the classical $R^2$ decomposition via canonical entropic optimal transport (EOT). DFI transforms correlated features into independent latent features using an EOT coupling for general covariate laws, including mixed and discrete settings. Importance scores are computed in this disentangled space and attributed back through the transition kernel's sensitivity. Under arbitrary feature dependencies, DFI provides a principled decomposition of latent importance scores that sum to the total predictive variability for latent additive models and to interaction-weighted functional ANOVA variances more generally. We develop semiparametric theory for DFI. Under the EOT formulation, we establish root-$n$ consistency and asymptotic normality for nondegenerate importance estimators in the latent space and the original feature space. Notably, our estimators achieve second-order estimation error, which vanishes if both regression function and EOT kernel estimation errors are $o_{\mathbb{P}}(n^{-1/4})$. By design, DFI avoids the computational burden of repeated submodel refitting and the challenges of conditional covariate distribution estimation, thereby achieving computational efficiency.

Paper Structure

This paper contains 75 sections, 26 theorems, 294 equations, 9 figures, 2 tables.

Key Result

lemma 2.1

Under the $\ell_2$ loss $\ell(\widehat{y},y)=(\widehat{y}-y)^2$, $\psi^{\textsc{cpi}}_{X_j}=\psi^{\textsc{loco}}_{X_j}=0$ if and only if $\mu(X)=\mu_{-j}(X_{-j})$ almost surely.
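The lemma can be checked in closed form for a Gaussian linear model. The sketch below is illustrative only and is not taken from the paper: it assumes the standard LOCO definition (excess squared-error risk of the submodel $\mu_{-j}$ over the full model $\mu$), a model $Y = X^\top\beta + \text{noise}$ with $X \sim N(0, \Sigma)$, and a hypothetical helper name `loco_gaussian_linear`. Under these assumptions $\mu(X) - \mu_{-j}(X_{-j}) = \beta_j\,(X_j - \mathbb{E}[X_j \mid X_{-j}])$, so the population LOCO value is $\beta_j^2 / (\Sigma^{-1})_{jj}$.

```python
import numpy as np


def loco_gaussian_linear(beta, sigma):
    """Population LOCO under squared-error loss (assumed setting, not the paper's code).

    For Y = X @ beta + noise with X ~ N(0, sigma), the conditional variance
    Var(X_j | X_{-j}) equals 1 / (sigma^{-1})_{jj}, so
    LOCO_j = beta_j^2 * Var(X_j | X_{-j}) = beta_j^2 / (sigma^{-1})_{jj}.
    """
    precision = np.linalg.inv(sigma)
    return beta**2 / np.diag(precision)


# X_1 carries all the signal; X_2 is a null feature that may be correlated with X_1.
beta = np.array([5.0, 0.0])
sigma_indep = np.array([[1.0, 0.0], [0.0, 1.0]])
sigma_corr = np.array([[1.0, 0.8], [0.8, 1.0]])

loco_indep = loco_gaussian_linear(beta, sigma_indep)
loco_corr = loco_gaussian_linear(beta, sigma_corr)
print("rho = 0.0:", loco_indep)
print("rho = 0.8:", loco_corr)
```

In this toy setting the null feature $X_2$ scores zero at both correlation levels, consistent with the lemma, while the score of $X_1$ shrinks from $25$ to $25\,(1 - 0.8^2) = 9$ as the correlation grows. This is the kind of correlation-induced distortion the abstract refers to.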

Figures (9)

  • Figure 1: Simulation results under model (M1). (a) weak correlation $(\rho = 0.2)$; (b) strong correlation $(\rho = 0.8)$. For every method, the number shown in parentheses is the total importance, which sums up to the signal variance $\mathbb{V}\!\bigl[\mathbb{E}[Y \mid X]\bigr]=25$ when the covariates are independent $(\rho = 0)$. Bars give the mean over $100$ random seeds, and the error bars indicate the corresponding standard deviations. The heatmap on the right visualizes the weights $(\Sigma^{\frac{1}{2}})_{jl}^{\,2}$ that transfer importance from the latent coordinates to the observed features.
  • Figure 2: Simulation results for model (M2). (a) weak correlation ($\rho = 0.2$); (b) strong correlation ($\rho = 0.8$). The interpretation of the bars, error bars, and heatmap is identical to Figure 1. The number in parentheses is the total estimated importance, which should be close to the signal variance $\mathbb{V}\!\bigl[\mathbb{E}[Y\mid X]\bigr] = 25 + 25\mathrm{e}^{-2} - 100\mathrm{e}^{-1} + 50\mathrm{e}^{-1}\cosh(\rho)$ (which is $\approx 10$ when $\rho = 0$, and $\approx 20$ when $\rho = 1$).
  • Figure 3: Simulation results for model (M3). (a) weak correlation ($\rho = 0.2$); (b) strong correlation ($\rho = 0.8$). The interpretation of the bars, error bars, and heatmap is identical to Figure 1.
  • Figure 4: Computational time of different feature importance measures in the logarithmic scale. For DFI, the computational time includes the estimation of transport plans and importance scores.
  • Figure 5: Feature importance of antibody against HIV-1 infection. DFI with EOT ($\varepsilon=10^{-3}$) is used to compute the group feature importance scores. (a) The bars show the point estimate, while the error bars indicate the values within one estimated standard deviation. (b) The corresponding Z-score for one-sided tests of $\mathcal{H}_{0l}:\phi_{X_l}(\mathbb{P})\leq 0$. (c) Feature group description. Stars ${\color{orange}\ast}$ and ${\color{red}\ast}$ denote importance deemed statistically significantly different from zero at the 0.05 and 0.0036 (0.05/14) levels, respectively.
  • ...and 4 more figures
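
The weights $(\Sigma^{\frac{1}{2}})_{jl}^{\,2}$ mentioned in the Figure 1 caption can be illustrated with a small Gaussian linear sketch. This is one plausible reading of the caption, not the paper's estimator: it swaps the EOT coupling for plain whitening $Z = \Sigma^{-1/2} X$, assumes $Y = X^\top\beta + \text{noise}$ with $X \sim N(0, \Sigma)$, and uses hypothetical helper names (`sym_sqrt`, `dfi_gaussian_linear`). Under these assumptions the latent importance of $Z_l$ is $(\Sigma^{1/2}\beta)_l^2$, and the squared entries of $\Sigma^{1/2}$ carry it back to the observed features.

```python
import numpy as np


def sym_sqrt(sigma):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(sigma)
    return (vecs * np.sqrt(vals)) @ vecs.T


def dfi_gaussian_linear(beta, sigma):
    """Toy latent and feature importances for a Gaussian linear model (assumed setting).

    With Z = sigma^{-1/2} X, the regression is Y = Z @ (sigma^{1/2} beta) + noise,
    so each independent latent coordinate Z_l has importance (sigma^{1/2} beta)_l^2.
    The weights (sigma^{1/2})_{jl}^2 then attribute latent importance to feature X_j.
    """
    root = sym_sqrt(sigma)
    latent = (root @ beta) ** 2   # importance of each latent coordinate
    weights = root**2             # (sigma^{1/2})_{jl}^2, as in the heatmap
    feature = weights @ latent    # importance attributed to each observed feature
    return latent, feature


beta = np.array([5.0, 0.0])
sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
latent, feature = dfi_gaussian_linear(beta, sigma)
signal_variance = float(beta @ sigma @ beta)

print("latent importances: ", latent)
print("feature importances:", feature)
print("signal variance:    ", signal_variance)
```

In this sketch the latent scores sum to the signal variance $\beta^\top \Sigma \beta$. Because $\Sigma$ has unit diagonal, each column of the weight matrix sums to one, so the feature scores add up to the same total, and the correlated feature $X_2$ receives a nonzero share.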

Theorems & Definitions (63)

  • example 1: Linear regression with correlated covariates
  • example 2: Genomics and systems biology
  • example 3: Natural language processing
  • example 4: Causal inference
  • lemma 2.1: Null features and conditional mean independence
  • lemma 2.2: Equivalence under $\ell_2$ loss
  • example 5: Dependent features with a bijective mapping
  • example 6: Dependent features with an injective mapping
  • proposition 3.1: Free of correlation distortion
  • Example 5 (revisited): Dependent features with a bijective mapping
  • ...and 53 more