Succinct Interaction-Aware Explanations
Sascha Xu, Joscha Cüppers, Jilles Vreeken
TL;DR
This work addresses SHAP's inability to capture feature interactions by proposing iShap, a partition-based extension that identifies blocks of interacting features and explains a model additively over those blocks. It formalizes a regularized objective that balances the reconstruction error $\left( f(x) - \sum_{S \in \Pi} v(S) \right)^2$ of a partition $\Pi$ against a sparsity-like penalty on block size, and uses a statistical pairwise interaction test to prune the search space, steering the search toward a partition $\Pi^*$ of meaningful coalitions. After partitioning, iShap computes Shapley values on the reduced game, yielding succinct yet informative explanations that reveal significant interactions without enumerating all subsets. Empirically, iShap improves interaction discovery, surrogate-model accuracy, and interpretability on synthetic GAMs and real data, including a Covid-19 patient case, and offers scalable exact and greedy variants for practical use.
Abstract
SHAP is a popular approach to explain black-box models by revealing the importance of individual features. As it ignores feature interactions, SHAP explanations can be confusing or even misleading. NSHAP, on the other hand, reports the additive importance for all subsets of features. While this does include all interacting sets of features, it also leads to an exponentially sized, difficult-to-interpret explanation. In this paper, we propose to combine the best of these two worlds by partitioning the features into parts that significantly interact, and using these parts to compose a succinct, interpretable, additive explanation. We derive a criterion by which to measure the representativeness of such a partition for a model's behavior, traded off against the complexity of the resulting explanation. To efficiently find the best partition out of super-exponentially many, we show how to prune sub-optimal solutions using a statistical test, which not only improves runtime but also helps to detect spurious interactions. Experiments on synthetic and real-world data show that our explanations are more accurate and more easily interpretable than those of SHAP and NSHAP, respectively.
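To make the trade-off between a partition's representativeness and its complexity concrete, the following is a minimal sketch of an exhaustive partition search on a toy model. It is not the authors' implementation. Several choices are assumptions made for illustration: the value function `value` marginalizes the remaining features over a background sample and subtracts the mean prediction as a baseline (the reconstruction target is shifted by the same baseline), the penalty counts feature pairs inside each block, and the names `partitions`, `value`, `objective`, and `lam` are hypothetical. The sketch enumerates all partitions without the statistical pruning step, so it is only feasible for a handful of features.

```python
import numpy as np

def partitions(items):
    """Yield every set partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        yield [[first]] + p                      # `first` forms its own block
        for i in range(len(p)):                  # or joins an existing block
            yield p[:i] + [[first] + p[i]] + p[i + 1:]

def value(f, x, background, block):
    """Assumed v(S): mean prediction with features in S fixed to x, minus baseline."""
    X = background.copy()
    X[:, block] = x[block]                       # fix the block's features to x
    return f(X).mean() - f(background).mean()

def objective(f, x, background, partition, lam):
    """Squared reconstruction error plus an assumed pair-count penalty."""
    target = f(x[None, :])[0] - f(background).mean()
    recon = sum(value(f, x, background, S) for S in partition)
    # Hypothetical penalty: number of feature pairs inside each block,
    # so singleton blocks cost nothing and large blocks cost more.
    penalty = lam * sum(len(S) * (len(S) - 1) / 2 for S in partition)
    return (target - recon) ** 2 + penalty

# Toy model: features 0 and 1 interact multiplicatively, feature 2 is additive.
f = lambda X: X[:, 0] * X[:, 1] + X[:, 2]
rng = np.random.default_rng(0)
background = rng.normal(size=(2000, 3))
x = np.array([2.0, 3.0, 1.0])

best = min(partitions([0, 1, 2]),
           key=lambda p: objective(f, x, background, p, lam=0.1))
print(sorted(sorted(b) for b in best))
```

On this toy model the search should select the blocks {0, 1} and {2}: grouping the two interacting features reconstructs the prediction exactly, while the all-singleton partition leaves a large error and the single block of all three features pays a higher penalty.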
