iLOCO: Distribution-Free Inference for Feature Interactions

Camille Little, Lili Zheng, Genevera Allen

TL;DR

This work introduces iLOCO, a model-agnostic, distribution-free metric that quantifies pairwise and higher-order feature interactions and attaches confidence intervals to them. By framing iLOCO within a functional ANOVA decomposition, it interprets joint interaction contributions as sums of squared ANOVA components and extends to higher-order terms. The authors develop two scalable inference approaches, data splitting and minipatch ensembles, with asymptotic validity under mild moment conditions, and demonstrate them through simulations and real-data case studies. The result is presented as the first practical inferential tool for detecting and quantifying feature interactions, enabling more reliable interpretation and decision-making in complex models.

Abstract

Feature importance measures are widely studied and are essential for understanding model behavior, guiding feature selection, and enhancing interpretability. However, many fitted machine learning models involve complex interactions between features. Existing feature importance metrics fail to capture these pairwise or higher-order effects, while existing interaction metrics often suffer from limited applicability or excessive computation; no methods exist to conduct statistical inference for feature interactions. To bridge this gap, we first propose a new model-agnostic metric, interaction Leave-One-Covariate-Out (iLOCO), for measuring the importance of pairwise feature interactions, with extensions to higher-order interactions. Next, we leverage recent advances in LOCO inference to develop distribution-free and assumption-light confidence intervals for our iLOCO metric. To address computational challenges, we also introduce an ensemble learning method for calculating the iLOCO metric and confidence intervals that we show is both computationally and statistically efficient. We validate our iLOCO metric and our confidence intervals on both synthetic and real data sets, showing that our approach outperforms existing methods and provides the first inferential approach to detecting feature interactions.
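To make the data-splitting idea concrete, here is a minimal sketch rather than the authors' implementation. It assumes the pairwise metric takes the inclusion-exclusion form $\Delta_j + \Delta_k - \Delta_{j,k}$, where $\Delta_S$ is the increase in held-out error when the features in $S$ are removed and the model is refit; that formula is not stated on this page and should be checked against the paper. The base learner (least squares on main effects and pairwise products), the squared-error loss, the 50/50 split, and the names fit_predict and iloco_split are all illustrative assumptions.

```python
from statistics import NormalDist
import numpy as np


def _design(X):
    """Intercept, main effects, and all pairwise products of the columns of X."""
    n, p = X.shape
    cols = [np.ones(n)] + [X[:, a] for a in range(p)]
    cols += [X[:, a] * X[:, b] for a in range(p) for b in range(a + 1, p)]
    return np.column_stack(cols)


def fit_predict(X_train, y_train, X_test):
    """Hypothetical base learner: ordinary least squares on the expanded design."""
    beta, *_ = np.linalg.lstsq(_design(X_train), y_train, rcond=None)
    return _design(X_test) @ beta


def iloco_split(X, y, j, k, alpha=0.1, seed=0):
    """Estimate iLOCO for pair (j, k) with a normal-approximation interval."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    perm = rng.permutation(n)
    train, test = perm[: n // 2], perm[n // 2:]

    def test_errors(dropped):
        # Refit on the training half without the dropped features,
        # then record squared errors on the held-out half.
        keep = [c for c in range(p) if c not in dropped]
        pred = fit_predict(X[np.ix_(train, keep)], y[train], X[np.ix_(test, keep)])
        return (y[test] - pred) ** 2

    err_full = test_errors(())
    delta_j = test_errors((j,)) - err_full       # LOCO score for feature j
    delta_k = test_errors((k,)) - err_full       # LOCO score for feature k
    delta_jk = test_errors((j, k)) - err_full    # LOCO score for the pair

    # Assumed iLOCO form: Delta_j + Delta_k - Delta_jk, per held-out sample.
    scores = delta_j + delta_k - delta_jk
    est = float(scores.mean())
    se = float(scores.std(ddof=1) / np.sqrt(scores.size))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return est, est - z * se, est + z * se
```

Calling iloco_split(X, y, 0, 1) on data where features 0 and 1 interact should return a clearly positive estimate, while a pair with only additive effects should give an estimate near zero. Four model fits are needed per pair, which is the computational cost the ensemble variant in the abstract is meant to reduce.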

Paper Structure

This paper contains 25 sections, 8 theorems, 46 equations, 10 figures, 2 tables, and 2 algorithms.

Key Result

Proposition 1

Suppose the functional ANOVA assumption (referenced in the source as assump:anova) holds. Then:

Figures (10)

  • Figure 1: Validation of iLOCO Metric. Part A shows the success probability of identifying an interaction pair in nonlinear classification scenarios (i) and (iii) using an MLP classifier. Part B presents the success probability of detecting an important, correlated feature pair across varying correlation strengths.
  • Figure 2: Comparative Evaluations. Success probability of detecting feature pair $(1,2)$ across SNR levels for KRBF, RF, and MLP classifiers on nonlinear classification simulations (i) and (ii).
  • Figure 3: Theory Validation. Coverage of 90% confidence intervals for a null feature pair in synthetic regression simulation (i) using KRBF, MLP, and RF as the base estimators.
  • Figure 4: iLOCO (computed via iLOCO-MP) marginal confidence intervals ($\alpha = 0.1$; adjusted for multiplicity via Bonferroni) on the Car Evaluation data (A) and Cocktail Recipe data (B). Interactions with confidence intervals that do not contain zero (blue dashed line) are statistically significant.
  • Figure A1: Part A shows the success probability of identifying an interaction pair in nonlinear classification scenarios (i) and (iii) using an RF classifier. Part B presents the success probability of detecting an important, correlated feature pair across varying correlation strengths.
  • ...and 5 more figures
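
The Bonferroni adjustment mentioned in the Figure 4 caption can be illustrated with a short sketch. It assumes two-sided normal-approximation intervals and that each of the $m$ candidate interactions comes with an estimate and a standard error; the function name bonferroni_intervals and the input values in the usage note are hypothetical and are not taken from the paper.

```python
from statistics import NormalDist
import numpy as np


def bonferroni_intervals(estimates, std_errors, alpha=0.1):
    """Two-sided intervals with family-wise level alpha split across m tests."""
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    m = est.size
    # Each interval uses level alpha / m, so the critical value grows with m.
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))
    lower, upper = est - z * se, est + z * se
    # An interaction is flagged when its interval excludes zero.
    significant = (lower > 0) | (upper < 0)
    return lower, upper, significant
```

For example, bonferroni_intervals([0.5, 0.01, -0.3], [0.1, 0.1, 0.1]) flags the first and third interactions and not the second, since only the second interval contains zero.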

Theorems & Definitions (16)

  • Definition 1
  • Proposition 1
  • Definition 2
  • Theorem 1: Coverage of iLOCO-Split
  • Theorem 2: Coverage of iLOCO-MP
  • Theorem 3: Coverage of iLOCO-Split
  • proof
  • Theorem 4: Coverage of iLOCO-MP
  • proof
  • Proposition 2
  • ...and 6 more