ORACLE: Explaining Feature Interactions in Neural Networks with ANOVA
Dongseok Kim, Hyoungsun Choi, Mohamed Jismy Aashik Rasool, Gisung Oh
TL;DR
ORACLE reframes neural network explanation as fitting an ANOVA-style surrogate on a discretized input grid, which yields orthogonal main-effect and pairwise-interaction maps that are $L^2$-consistent with the backbone. It defines target interaction strengths $S^{\star}_{jk}$ and provides a projection-based surrogate whose maps and strengths converge to this oracle as grid resolution and data increase. Empirically, ORACLE yields more faithful and more stable interaction rankings and localization than SHAP-based baselines on synthetic and real tabular benchmarks, and it supports DoE-style interpretation and cross-backbone transfer; the latent-domain results suggest limits in highly entangled representations. The work builds a bridge between classical ANOVA/DoE and modern neural explanations, and outlines extensions to higher-order interactions, adaptive grids, and hybrid methods for broader model classes and applications.
Abstract
We introduce ORACLE, a framework that explains neural networks on tabular and scientific design data by fitting ANOVA-style main and pairwise interaction effects to a model's prediction surface. ORACLE treats a trained network as a black-box response, learns an orthogonal factorial surrogate on a discretized input grid, and uses simple centering and $\mu$-rebalancing steps to obtain main- and interaction-effect tables that remain $L^2$-consistent with the original model. The resulting grid-based interaction maps are easy to visualize, comparable across backbones, and directly connected to classical design-of-experiments analyses. On synthetic factorial and low- to medium-dimensional tabular regression benchmarks, ORACLE recovers ground-truth ANOVA interactions and hotspot structure more accurately than Monte Carlo SHAP-family interaction methods, as measured by ranking, localization, and cross-backbone stability metrics. In latent image and text settings, the results instead delineate ORACLE's natural scope: grid-based ANOVA surrogates are most effective when features admit interpretable factorial structure, which makes ORACLE particularly well suited to scientific and engineering tabular workflows that require stable, DoE-style interaction summaries.
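To make the idea of centered main- and interaction-effect tables concrete, the following is a minimal sketch of a classical two-factor ANOVA decomposition on a uniform grid. It is not the authors' implementation: the function `model` is a hypothetical stand-in for a trained network, the helper `anova_tables` and the choice of 8 bins are illustrative, cells are weighted uniformly, and the root-mean-square of the interaction table is only one plausible way to summarize interaction strength. The $\mu$-rebalancing step mentioned above is not reproduced here.

```python
import numpy as np

def anova_tables(F):
    """Decompose a response table F[a, b] into grand mean, two main-effect
    vectors, and an interaction table, assuming uniform cell weights."""
    mu = F.mean()                      # grand mean
    main_j = F.mean(axis=1) - mu       # main effect of the first feature
    main_k = F.mean(axis=0) - mu       # main effect of the second feature
    # What remains after removing the additive part is the interaction.
    inter = F - mu - main_j[:, None] - main_k[None, :]
    return mu, main_j, main_k, inter

def model(x1, x2):
    # Hypothetical black-box response standing in for a trained network.
    return 1.0 + 2.0 * x1 - x2 + 3.0 * x1 * x2

bins = 8                                      # illustrative grid resolution
centers = (np.arange(bins) + 0.5) / bins      # bin midpoints on [0, 1]
X1, X2 = np.meshgrid(centers, centers, indexing="ij")
F = model(X1, X2)                             # query the model on the grid

mu, main_j, main_k, inter = anova_tables(F)

# The four pieces add back up to the model's grid predictions exactly.
recon = mu + main_j[:, None] + main_k[None, :] + inter

# One possible scalar summary of the pairwise interaction (an assumption,
# not necessarily the paper's definition of interaction strength).
strength = float(np.sqrt(np.mean(inter ** 2)))
```

In this sketch the interaction table has zero mean along every row and every column, so it is orthogonal to the main effects under uniform weights, and for the toy `model` it equals `3 * (x1 - 0.5) * (x2 - 0.5)` evaluated at the bin centers. A purely additive model would give an interaction table of zeros.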
