META-ANOVA: Screening interactions for interpretable machine learning
Yongchan Choi, Seokhun Park, Chanmoo Park, Dongha Kim, Yongdai Kim
TL;DR
Meta-ANOVA addresses the interpretability gap of high-performing black-box models by transforming any predictor into a functional ANOVA surrogate and introducing a statistically justified interaction-screening procedure. Its key innovations are the I(j) screening score, which enables inclusion of higher-order interactions without prohibitive computation, and a two-step pipeline that first prunes interactions and then learns the surrogate from the selected terms using NAM/NIM. The screening procedure is proven asymptotically consistent, and experiments show near-parity in predictive accuracy with substantially improved interpretability and post-hoc explanations (ANOVA-SHAP) across synthetic and real datasets, including large-scale models and cross-domain tasks. The method offers a model-agnostic, scalable option for transparent AI, enabling reliable interpretation in high-stakes applications while preserving strong predictive performance.
Abstract
There are two things to be considered when we evaluate predictive models. One is prediction accuracy, and the other is interpretability. Over recent decades, many high-performance prediction models, such as ensemble-based models and deep neural networks, have been developed. However, these models are often so complex that their predictions are difficult to interpret intuitively. This difficulty limits their use in many real-world fields that require accountability, such as medicine, finance, and college admissions. In this study, we develop a novel method called Meta-ANOVA to provide an interpretable model for any given prediction model. The basic idea of Meta-ANOVA is to transform a given black-box prediction model into a functional ANOVA model. A novel technical contribution of Meta-ANOVA is a procedure for screening out unnecessary interactions before transforming a given black-box model into the functional ANOVA model. This screening procedure allows the inclusion of higher-order interactions in the transformed functional ANOVA model without computational difficulties. We prove that the screening procedure is asymptotically consistent. Through various experiments with synthetic and real-world datasets, we empirically demonstrate the superiority of Meta-ANOVA.
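For orientation, the functional ANOVA model referred to above is usually written as a sum of main effects and interactions of increasing order. The display below is a sketch in standard textbook notation, not taken from this text: the symbols p (number of input features), f_j, f_{jk}, and the maximum order d are assumptions introduced only for illustration.

```latex
% Functional ANOVA decomposition of a predictor f over p features
% (standard form; notation assumed for illustration).
\[
  f(x) \;=\; \beta_0
  \;+\; \sum_{j=1}^{p} f_j(x_j)
  \;+\; \sum_{1 \le j < k \le p} f_{jk}(x_j, x_k)
  \;+\; \cdots
  \;+\; f_{1 \cdots p}(x_1, \ldots, x_p)
\]

% Number of candidate components when interactions up to order d are allowed.
\[
  \#\{\text{components of order} \le d\} \;=\; \sum_{k=1}^{d} \binom{p}{k}
\]
```

Under these assumptions, p = 50 and d = 3 already give 50 + 1,225 + 19,600 = 20,875 candidate components. That count illustrates why screening out unnecessary interactions before fitting matters for including higher-order terms.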
