Conditional Feature Importance revisited: Double Robustness, Efficiency and Inference
Angel Reyero-Lobo, Pierre Neuvial, Bertrand Thirion
TL;DR
This work consolidates Conditional Feature Importance (CFI) theoretically and empirically, showing that Conditional Permutation Importance (CPI) is a valid implementation of CFI when the conditional sampling step is performed properly. Under the conditional null hypothesis, it establishes a double robustness property that aids variable selection: null coordinates are correctly identified when either the model or the conditional sampler is valid. It links CPI to the Total Sobol Index (TSI) through the Sobol-CPI (SCPI), which is nonparametrically efficient and comes with a bias correction, and it provides a consistent test with valid type-I error control. Experiments illustrate improved null detection and competitive power without excessive computational cost. Overall, the paper strengthens the theoretical foundations of CFI and delivers practical tools for reliable variable importance assessment and inference.
Abstract
Conditional Feature Importance (CFI) was introduced long ago to account for the relationship between the studied feature and the rest of the input. However, CFI has not yet been studied from a theoretical perspective because the conditional sampling step has generally been overlooked. In this article, we demonstrate that the recent Conditional Permutation Importance (CPI) is indeed a valid implementation of this concept. Under the conditional null hypothesis, we then establish a double robustness property that can be leveraged for variable selection: with either a valid model or a valid conditional sampler, the method correctly identifies null coordinates. Under the alternative hypothesis, we study the theoretical target and link it to the popular Total Sobol Index (TSI). We introduce the Sobol-CPI, which generalizes CPI/CFI, prove that it is nonparametrically efficient, and provide a bias correction. Finally, we propose a consistent test with valid type-I error control and present numerical experiments that illustrate our findings.
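To make the conditional sampling step concrete, the following is a minimal sketch of a CPI-style computation on simulated data. It is an illustration under stated assumptions, not the authors' implementation: the predictive model and the conditional sampler are both taken to be linear least-squares fits, the sampler regresses feature j on the remaining features and permutes the residuals, and the names `fit_linear` and `cpi_scores` are hypothetical helpers introduced here. Feature 1 is correlated with feature 0 but has no effect on the response given the others, so it is a conditional null.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with intercept; returns a prediction function."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ coef


def cpi_scores(X_train, y_train, X_test, y_test, j, rng):
    """Per-sample loss differences for feature j (hypothetical helper).

    Assumption: both the model and the conditional sampler are linear.
    """
    model = fit_linear(X_train, y_train)
    rest = [k for k in range(X_train.shape[1]) if k != j]

    # Conditional sampler: predict X_j from X_{-j}, then permute residuals
    # so the perturbed X_j keeps its dependence on the other features.
    sampler = fit_linear(X_train[:, rest], X_train[:, j])
    pred_j = sampler(X_test[:, rest])
    resid = X_test[:, j] - pred_j
    X_tilde = X_test.copy()
    X_tilde[:, j] = pred_j + rng.permutation(resid)

    loss_orig = (y_test - model(X_test)) ** 2
    loss_pert = (y_test - model(X_tilde)) ** 2
    return loss_pert - loss_orig


rng = np.random.default_rng(0)
n = 2000
# Features 0 and 1 are correlated; feature 2 is independent noise.
cov = np.array([[1.0, 0.8, 0.0],
                [0.8, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
X = rng.multivariate_normal(np.zeros(3), cov, size=n)
y = 2.0 * X[:, 0] + rng.normal(size=n)  # only feature 0 matters

X_train, X_test = X[: n // 2], X[n // 2:]
y_train, y_test = y[: n // 2], y[n // 2:]

# Average loss increase per feature, estimated on held-out data.
importance = {
    j: float(cpi_scores(X_train, y_train, X_test, y_test, j, rng).mean())
    for j in range(X.shape[1])
}
print(importance)
```

In this simulation, the score of feature 0 should be clearly positive, while the scores of features 1 and 2 should stay close to zero even though feature 1 is strongly correlated with feature 0. A marginal permutation of feature 1 would break that correlation and evaluate the model on implausible inputs, which is the situation conditional sampling is meant to avoid.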
