A general framework for inference on algorithm-agnostic variable importance
Brian D. Williamson, Peter B. Gilbert, Noah R. Simon, Marco Carone
TL;DR
This work presents a unified, model-agnostic framework that quantifies intrinsic variable importance as a population-level contrast in oracle predictiveness, $\psi_{0,s}=V(f_0,P_0)-V(f_{0,s},P_0)$, which does not depend on any single prediction algorithm. It develops nonparametric efficient plug-in estimators for a broad class of predictiveness measures, establishes their asymptotic efficiency under regularity conditions, and introduces cross-fitting to accommodate flexible machine-learning-based nuisance estimation. The authors also provide strategies for valid inference when the true importance is zero and extend the approach to more complex settings, including causal and missing-data scenarios. In simulations the procedure shows favorable operating characteristics, and the authors illustrate the framework by analyzing HIV-1 antibody resistance features, identifying which feature groups contribute most to predictive performance. Overall, the framework supports algorithm-agnostic assessment of variable importance, with concrete guidance for inference and practical use in high-stakes scientific studies.
Abstract
In many applications, it is of interest to assess the relative contribution of features (or subsets of features) toward the goal of predicting a response; in other words, to gauge the variable importance of features. Most recent work on variable importance assessment has focused on describing the importance of features within the confines of a given prediction algorithm. However, such assessment does not necessarily characterize the prediction potential of features, and may provide a misleading reflection of the intrinsic value of these features. To address this limitation, we propose a general framework for nonparametric inference on interpretable algorithm-agnostic variable importance. We define variable importance as a population-level contrast between the oracle predictiveness of all available features versus all features except those under consideration. We propose a nonparametric efficient estimation procedure that allows the construction of valid confidence intervals, even when machine learning techniques are used. We also outline a valid strategy for testing the null importance hypothesis. Through simulations, we show that our proposal has good operating characteristics, and we illustrate its use with data from a study of an antibody against HIV-1 infection.
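To make the definition concrete, the following is a minimal sketch of a cross-fitted plug-in estimate of the contrast between full and reduced predictiveness. It is not the authors' implementation. Several choices are assumptions made purely for illustration: $R^2$ is taken as the predictiveness measure, ordinary least squares stands in for a flexible machine-learning fit (it is correctly specified here only because the simulated data are linear with independent features), and the helper names `fit_ols`, `r_squared`, and `cross_fitted_importance` are hypothetical. Confidence intervals and the null-importance test are not covered.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares fit with intercept; returns a prediction function."""
    Z = [[1.0] + list(row) for row in X]
    p = len(Z[0])
    ZtZ = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Zty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve_linear(ZtZ, Zty)
    return lambda row: beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def r_squared(f, X, y):
    """Plug-in R^2 predictiveness: 1 - MSE(f) / Var(y)."""
    mean_y = sum(y) / len(y)
    mse = sum((yi - f(row)) ** 2 for row, yi in zip(X, y)) / len(y)
    var = sum((yi - mean_y) ** 2 for yi in y) / len(y)
    return 1.0 - mse / var

def cross_fitted_importance(X, y, drop, n_folds=5):
    """Average over folds of V(full fit) - V(reduced fit) on held-out data."""
    n = len(y)
    keep = [j for j in range(len(X[0])) if j not in drop]
    X_red = [[row[j] for j in keep] for row in X]
    diffs = []
    for k in range(n_folds):
        test = [i for i in range(n) if i % n_folds == k]
        train = [i for i in range(n) if i % n_folds != k]
        y_train = [y[i] for i in train]
        y_test = [y[i] for i in test]
        # Fit on the training folds only, then evaluate on the held-out fold.
        f_full = fit_ols([X[i] for i in train], y_train)
        f_red = fit_ols([X_red[i] for i in train], y_train)
        v_full = r_squared(f_full, [X[i] for i in test], y_test)
        v_red = r_squared(f_red, [X_red[i] for i in test], y_test)
        diffs.append(v_full - v_red)
    return sum(diffs) / n_folds

# Simulated data (assumed for illustration): Y = 2*X1 + X2 + noise,
# with X1, X2, and the noise independent standard normal.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
y = [2 * x1 + x2 + rng.gauss(0, 1) for x1, x2 in X]

psi_x1 = cross_fitted_importance(X, y, drop={0})  # importance of X1
psi_x2 = cross_fitted_importance(X, y, drop={1})  # importance of X2
```

In this simulated design the outcome variance is 6, so the population $R^2$-based importance is $4/6$ for the first feature and $1/6$ for the second; the two estimates should land near those values. Evaluating each fit on data not used to train it is what lets the same recipe be used with more flexible learners in place of least squares.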
