Unifying Feature-Based Explanations with Functional ANOVA and Cooperative Game Theory

Fabian Fumagalli, Maximilian Muschalik, Eyke Hüllermeier, Barbara Hammer, Julia Herbinger

TL;DR

This work introduces a unified framework for local and global feature-based explanations using two well-established concepts: functional ANOVA (fANOVA) from statistics, and the notion of value and interaction from cooperative game theory.

Abstract

Feature-based explanations, using perturbations or gradients, are a prevalent tool for understanding the decisions of black-box machine learning models. Yet the differences between these methods remain mostly unknown, which limits their applicability for practitioners. In this work, we introduce a unified framework for local and global feature-based explanations using two well-established concepts: functional ANOVA (fANOVA) from statistics, and the notion of value and interaction from cooperative game theory. We introduce three fANOVA decompositions that determine the influence of feature distributions, and use game-theoretic measures, such as the Shapley value and interactions, to specify the influence of higher-order interactions. Our framework combines these two dimensions to uncover similarities and differences between a wide range of explanation techniques for features and groups of features. We then empirically showcase the usefulness of our framework on synthetic and real-world datasets.

Paper Structure

This paper contains 117 sections, 7 theorems, 94 equations, 11 figures, 4 tables.

Key Result

Theorem 1

If $F$ is represented by its Taylor series expanded around $b$ for an instance $x_0$, then the effect $f_S^{(b)}$ is given by the generic effects [Deng2024Unify]

Figures (11)

  • Figure 1: Categorization of selected feature-based explanations within our framework. Color distinguishes local methods from global methods (risk, sensitivity); symbols distinguish the explanation types: individual ($\bigcirc$) and joint ($\square$) influence, and interactions ($\triangle$). Each imputation method (b/m/c) corresponds to an fANOVA decomposition with increasing influence of feature distributions, whereas pure, partial, and full effects are increasingly influenced by higher-order interactions.
  • Figure 2: Local explanations for the instance $x = (1,1,1,1)$ averaged over $30$ repetitions of varying random seeds (fluctuation is shown by error bars). Note that the model is correctly specified as $F(x) = 2x_1 + 2x_2 + 2x_3 + x_1 x_2 + x_1 x_2 x_3$.
  • Figure 3: Pure (a) and full (b) individual effects and two-way interactions for the global sensitivity game of an XGBoost model trained on California housing. Blue and red denote a reduction and an increase in variance, respectively.
  • Figure 4: Individual (left) and second-order partial interaction (2-SV, right) effects for the local explanation game of a sentiment analysis language model. While "but" has a negative pure effect, its partial and full effects are positive, indicating positive higher-order interactions. In fact, ("bad","but") and ("but","enjoyable") have a strong positive partial interaction effect.
  • Figure 5: Individual local explanations for the instance $x = (1,1,1,1)$ and the linear model (left) with the ground-truth functional relationship $F_{\text{lin}}(x) = 2x_1 + 2 x_2 + 2 x_3$ and the linear model including interactions (right) with the ground-truth functional relationship $F_{\text{int}}(x) = 2x_1 + 2 x_2 + 2 x_3 + x_1x_2 + x_1x_2x_3$ averaged over $30$ repetitions of varying random seeds (fluctuation is shown by error bars).
  • ...and 6 more figures
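
To make the pure, partial, and full individual effects from Figures 1, 2, and 5 concrete, the following is a minimal sketch that evaluates them on $F_{\text{int}}$ at $x = (1,1,1,1)$. It rests on assumptions that are not stated in this summary: a zero baseline $b = (0,0,0,0)$ for baseline imputation, the pure effect read as the Möbius coefficient of the singleton, the partial effect read as the Shapley value, and the full effect read as the sum of all Möbius coefficients involving the feature. All function names are hypothetical and the paper's experiments may use a different setup.

```python
from itertools import combinations


def f_int(x):
    """Ground-truth function F_int from the Figure 5 caption (x4 is unused)."""
    x1, x2, x3, _x4 = x
    return 2 * x1 + 2 * x2 + 2 * x3 + x1 * x2 + x1 * x2 * x3


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def baseline_game(model, x, baseline):
    """Value of each coalition S: features in S keep x, the rest take the baseline."""
    n = len(x)
    return {
        S: model([x[i] if i in S else baseline[i] for i in range(n)])
        for S in subsets(range(n))
    }


def moebius(game):
    """Moebius transform: m(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T)."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * game[T] for T in subsets(S))
        for S in game
    }


def individual_effects(m, n):
    """Pure, partial (Shapley), and full effects under the assumed definitions."""
    pure = [m[frozenset({i})] for i in range(n)]
    partial = [sum(v / len(S) for S, v in m.items() if i in S) for i in range(n)]
    full = [sum(v for S, v in m.items() if i in S) for i in range(n)]
    return pure, partial, full


x = (1, 1, 1, 1)
b = (0, 0, 0, 0)  # assumed baseline, not taken from the paper
game = baseline_game(f_int, x, b)
pure, partial, full = individual_effects(moebius(game), len(x))
print("pure:   ", pure)
print("partial:", [round(p, 4) for p in partial])
print("full:   ", full)
```

Under these assumptions the three effect types coincide for $x_4$, which enters no term, and diverge for $x_1$, $x_2$, and $x_3$: the interaction terms $x_1 x_2$ and $x_1 x_2 x_3$ are ignored by the pure effect, shared equally among the participating features by the partial effect, and credited entirely to each participating feature by the full effect.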

Theorems & Definitions (21)

  • Definition 1
  • Theorem 1
  • Definition 2
  • Definition 3
  • Theorem 2
  • Definition 4
  • Corollary 1
  • Remark 1
  • Definition 5
  • Definition 6
  • ...and 11 more