$φ$-test: Global Feature Selection and Inference for Shapley Additive Explanations
Dongseok Kim, Hyoungsun Choi, Mohamed Jismy Aashik Rasool, Gisung Oh
TL;DR
$φ$-test provides statistically principled global feature importance for black-box predictors by combining SHAP-based screening, a linear surrogate, and selective inference for post-selection uncertainty. It outputs a global feature-importance table with SHAP scores, surrogate coefficients, and selection-adjusted $p$-values and confidence intervals. On real tabular regression tasks with tree-based and neural backbones, $φ$-test retains much of the original model's predictive ability using a small feature subset that remains fairly stable across resamples and backbone classes. The method offers a practical bridge between SHAP explanations and classical statistical inference, giving interpretable, uncertainty-aware global explanations.
Abstract
We propose $φ$-test, a global feature-selection and significance procedure for black-box predictors that combines Shapley attributions with selective inference. Given a trained model and an evaluation dataset, $φ$-test performs SHAP-guided screening and fits a linear surrogate on the screened features via a selection rule with a tractable selective-inference form. For each retained feature, it outputs a Shapley-based global score, a surrogate coefficient, and post-selection $p$-values and confidence intervals in a global feature-importance table. Experiments on real tabular regression tasks with tree-based and neural backbones suggest that $φ$-test can retain much of the predictive ability of the original model while using only a few features and producing feature sets that remain fairly stable across resamples and backbone classes. In these settings, $φ$-test acts as a practical global explanation layer linking Shapley-based importance summaries with classical statistical inference.
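The three stages described above (screening by a Shapley-based global score, fitting a linear surrogate on the screened features, and reporting post-selection uncertainty) can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions made for illustration: the names `linear_shap` and `phi_test_sketch` are hypothetical, the black box is a toy linear model on synthetic data (so that its SHAP values have the closed form $w_j (x_{ij} - \bar{x}_j)$ under feature independence), the surrogate is fitted to the model's predictions, and simple data splitting with normal-approximation intervals stands in for the paper's selective-inference procedure.

```python
import math
import numpy as np


def linear_shap(weights, X):
    """SHAP values of a linear model under feature independence:
    phi[i, j] = w_j * (x_ij - mean_j), with the mean taken over X."""
    return (X - X.mean(axis=0)) * weights


def phi_test_sketch(predict, shap_fn, X, k, seed=0):
    """Hypothetical helper: screen k features by mean |SHAP| on one half of X,
    then fit and test a linear surrogate on the other half (data splitting)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    half = X.shape[0] // 2
    sel, inf = idx[:half], idx[half:]

    # Stage 1: global score = mean absolute SHAP value; keep the top k features.
    global_score = np.abs(shap_fn(X[sel])).mean(axis=0)
    keep = np.sort(np.argsort(global_score)[::-1][:k])

    # Stage 2: least-squares surrogate of the model's predictions (assumed target).
    A = np.column_stack([np.ones(len(inf)), X[inf][:, keep]])
    y = predict(X[inf])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(inf) - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))

    # Stage 3: two-sided p-values and 95% intervals (normal approximation).
    # Selection used only the other half, so these are valid without adjustment.
    rows = []
    for pos, j in enumerate(keep, start=1):  # position 0 is the intercept
        z = beta[pos] / se[pos]
        rows.append({
            "feature": int(j),
            "shap_score": float(global_score[j]),
            "coef": float(beta[pos]),
            "p_value": math.erfc(abs(z) / math.sqrt(2)),
            "ci_low": float(beta[pos] - 1.96 * se[pos]),
            "ci_high": float(beta[pos] + 1.96 * se[pos]),
        })
    return rows


# Toy setup (assumed): five independent Gaussian features, linear "black box".
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 5))
weights = np.array([2.0, -1.5, 0.8, 0.05, 0.0])

table = phi_test_sketch(
    predict=lambda Z: Z @ weights,
    shap_fn=lambda Z: linear_shap(weights, Z),
    X=X,
    k=2,
)
for row in table:
    print(row)
```

Each row mirrors one line of the global feature-importance table: a Shapley-based score, a surrogate coefficient, a $p$-value, and a confidence interval. Data splitting is used here only because it is easy to verify; it spends half of the data on selection, which is the cost that a selective-inference procedure conditioning on the selection event is designed to avoid.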
