xplainfi: Feature Importance and Statistical Inference for Machine Learning in R

Lukas Burk, Fiona Katharina Ewald, Giuseppe Casalicchio, Marvin N. Wright, Bernd Bischl

Abstract

We introduce xplainfi, an R package built on top of the mlr3 ecosystem that implements global, loss-based feature importance methods for machine learning models. Various feature importance methods exist in R, but significant gaps remain, particularly regarding conditional importance methods and associated statistical inference procedures. The package implements permutation feature importance (PFI), conditional feature importance (CFI), relative feature importance, leave-one-covariate-out, and generalizations thereof, as well as both marginal and conditional Shapley additive global importance (SAGE) methods. It provides a modular conditional sampling architecture with samplers based on Gaussian distributions, adversarial random forests, conditional inference trees, and knockoffs, which enables conditional importance analysis for continuous and mixed data. Statistical inference is available through multiple approaches, including variance-corrected confidence intervals and the conditional predictive impact framework. We demonstrate that xplainfi produces importance scores consistent with existing implementations across multiple simulation settings and learner types, while offering competitive runtime performance. The package is available on CRAN and provides researchers and practitioners with a comprehensive toolkit for feature importance analysis and model interpretation in R.
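
As a language-agnostic illustration of the simplest method named in the abstract, the following minimal sketch computes permutation feature importance as the average increase in loss after shuffling one feature column. It is written in plain Python and does not use xplainfi or mlr3. The function names (`mse`, `permutation_importance`, `predict`), the toy data, and the fixed "model" are all hypothetical and chosen only for this example.

```python
import random
import statistics


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average loss increase per feature after permuting that feature.

    `predict` maps one row (a list of floats) to a prediction.
    This is a marginal permutation: each column is shuffled on its own,
    ignoring any dependence on the other features.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    scores = []
    for j in range(len(X[0])):
        diffs = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            permuted_loss = mse(y, [predict(row) for row in X_perm])
            diffs.append(permuted_loss - baseline)
        scores.append(statistics.mean(diffs))
    return scores


# Hypothetical toy data: y depends on x0 only; x1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(500)]
y = [2.0 * x0 + data_rng.gauss(0, 0.1) for x0, _ in X]


def predict(row):
    """Stand-in for a fitted model: uses x0 and ignores x1."""
    return 2.0 * row[0]


pfi = permutation_importance(predict, X, y, n_repeats=10, seed=1)
print(pfi)
```

Because each column is shuffled independently of the others, this marginal scheme can evaluate the model on unrealistic feature combinations when features are correlated. Conditional methods replace the shuffle with draws from a conditional sampler.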

Paper Structure (25 sections, 4 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Comparison of uncertainty quantification methods using PFI on the correlated task (omitting noise features). Empirical 95% quantiles fall between the very narrow unadjusted confidence intervals and the wider Nadeau-Bengio-corrected intervals, which show the uncertainty masked by the unadjusted method.
  • Figure 2: Importance scores (scaled to percentages) for PFI, CFI, mSAGE, and cSAGE across implementations on the correlated simulation setting with r = 0.75, based on either the linear model or the boosting learner.
  • Figure 3: Importance scores (scaled to percentages) for PFI and CFI across implementations on the bike sharing dataset, based on either the linear model or the boosting learner.
  • Figure 4: Importance scores (scaled to percentages) for mSAGE and cSAGE across implementations on the bike sharing dataset, based on either the linear model or the boosting learner.
  • Figure 5: Median runtime in seconds (with 25% and 75% quantiles) across 25 replications for PFI and CFI with n_repeats = 50, and for mSAGE and cSAGE with n_samples = 100 and n_permutations = 100, across implementations on the peak simulation setting with 5, 10, and 20 features and 5000 samples, using a linear model.