Towards Attributions of Input Variables in a Coalition

Xinhao Zheng, Huiqi Deng, Quanshi Zhang

TL;DR

This paper analyzes the numerical effects of AND-OR interactions in AI models and extends the Shapley value to a new attribution metric for variable coalitions, revealing that specific interactions cause attribution conflicts.

Abstract

This paper focuses on the fundamental challenge of partitioning input variables in attribution methods for Explainable AI, particularly in Shapley value-based approaches. Previous methods always compute attributions given a predefined partition but lack theoretical guidance on how to form meaningful variable partitions. We identify that attribution conflicts arise when the attribution of a coalition differs from the sum of its individual variables' attributions. To address this, we analyze the numerical effects of AND-OR interactions in AI models and extend the Shapley value to a new attribution metric for variable coalitions. Our theoretical findings reveal that specific interactions cause attribution conflicts, and we propose three metrics to evaluate coalition faithfulness. Experiments on synthetic data, NLP, image classification, and the game of Go validate our approach, demonstrating consistency with human intuition and practical applicability.

Paper Structure

This paper contains 28 sections, 7 theorems, 33 equations, 7 figures, and 5 tables.

Key Result

Theorem 3.2

(Reformulation of the Shapley value; proved in the appendix.) The Shapley value $\phi(i)$ of each input variable $x_i$ can be rewritten as $\phi(i)=\sum_{S\subseteq N,\, i\in S}\frac{1}{|S|}\left[I_\text{and}(S)+I_\text{or}(S)\right]$. In other words, every AND or OR interaction $S$ that contains $x_i$ assigns an equal $1/|S|$ share of its numerical effect to $x_i$.
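The reformulation can be checked numerically on a small example. The sketch below is illustrative only and rests on several assumptions that are not taken from this summary: the toy model contains AND interactions only (so $I_\text{or}(S)=0$), $I_\text{and}(S)$ is computed as the Harsanyi dividend $\sum_{T\subseteq S}(-1)^{|S|-|T|}v(T)$, and the names `toy_model`, `PATTERNS`, and the interaction weights are hypothetical. The three patterns mirror the coalitions in Figure 1(a).

```python
from itertools import combinations, permutations
from math import isclose

# Hypothetical toy model over six variables; the weights are made up.
PLAYERS = (1, 2, 3, 4, 5, 6)
PATTERNS = {
    frozenset({1, 2}): 3.0,                # S_1
    frozenset({1, 2, 3, 4, 5, 6}): 6.0,    # S_2
    frozenset({5, 6}): -2.0,               # S_3
}


def toy_model(present):
    """Output when only the variables in `present` are unmasked.

    A pattern fires only if all of its variables are present (AND logic).
    """
    return sum(w for pattern, w in PATTERNS.items() if pattern <= present)


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def and_interactions(v, players):
    """Assumed definition: Harsanyi dividend of every subset S."""
    return {
        S: sum((-1) ** (len(S) - len(T)) * v(T) for T in subsets(S))
        for S in subsets(players)
    }


def shapley_by_permutations(v, players):
    """Classic Shapley value: average marginal gain over all orderings."""
    totals = {i: 0.0 for i in players}
    orders = list(permutations(players))
    for order in orders:
        present = frozenset()
        for i in order:
            with_i = present | {i}
            totals[i] += v(with_i) - v(present)
            present = with_i
    return {i: total / len(orders) for i, total in totals.items()}


def shapley_from_interactions(interactions, players):
    """Reformulated Shapley value: each S containing i gives I(S) / |S|."""
    return {
        i: sum(val / len(S) for S, val in interactions.items() if i in S)
        for i in players
    }


interactions = and_interactions(toy_model, PLAYERS)
phi_perm = shapley_by_permutations(toy_model, PLAYERS)
phi_int = shapley_from_interactions(interactions, PLAYERS)

for i in PLAYERS:
    print(i, phi_perm[i], phi_int[i])
    assert isclose(phi_perm[i], phi_int[i], abs_tol=1e-9)
```

With these made-up weights, both routes give $\phi(x_1) = \tfrac{1}{2}\cdot 3 + \tfrac{1}{6}\cdot 6 = 2.5$, matching the decomposition pattern stated in Figure 1(a). Brute-force enumeration is exponential in the number of variables, so this is only practical for toy cases.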

Figures (7)

  • Figure 1: (a) AND-OR interaction: Let the AI model encode three interactions, $S_1 = \{x_1, x_2\}$, $S_2=\{x_1, x_2, x_3, x_4, x_5, x_6\}$, and $S_3 = \{x_5, x_6\}$. Then the Shapley value of $x_1$ can be decomposed as $\phi(x_1)=1/2\cdot I(S_1) + 1/6\cdot I(S_2)$. (b) Conflict of attributions: Consider another example with three interactions, $S_1=\{x_1,x_2,x_3,x_4\}$, $S_2=\{x_1,x_2\}$, and $S_3=\{x_2,x_3,x_4\}$. The attribution of the coalition $\{x_1,x_2\}$ is not equal to the sum of the attributions of the input variables $x_1$ and $x_2$, i.e., $\varphi(S=\{x_1,x_2\})\neq \phi(x_1)+\phi(x_2)$.
  • Figure 2: Visualization of two approaches for the selection of coalitions in KataGo. For a coalition $S$ of stones, $\varphi(S)>0$ means that the coalition has a positive numerical effect for White, while $\varphi(S)<0$ means that it has a negative effect.
  • Figure 3: Analysis of shape patterns in Go compared to human intuition
  • Figure 4: Coalition attribution faithfulness metrics of VGG-11 on CIFAR-10 dataset
  • Figure 5: Coalition attribution faithfulness metrics of ResNet-20 on CIFAR-10 dataset
  • ...and 2 more figures
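
As a worked illustration of the conflict in Figure 1(b), applying the Theorem 3.2 decomposition to the three interactions there ($|S_1|=4$, $|S_2|=2$, $|S_3|=3$) gives the individual attributions below. Writing $I(S)$ for the combined interaction effect follows the caption's notation.

```latex
\begin{aligned}
\phi(x_1) &= \tfrac{1}{4} I(S_1) + \tfrac{1}{2} I(S_2), \\
\phi(x_2) &= \tfrac{1}{4} I(S_1) + \tfrac{1}{2} I(S_2) + \tfrac{1}{3} I(S_3), \\
\phi(x_1) + \phi(x_2) &= \tfrac{1}{2} I(S_1) + I(S_2) + \tfrac{1}{3} I(S_3).
\end{aligned}
```

The term $\tfrac{1}{3} I(S_3)$ comes from an interaction that contains $x_2$ but not $x_1$, so it covers the coalition $\{x_1,x_2\}$ only partially. The definition of the coalition attribution $\varphi(S)$ is not reproduced in this summary; assuming it credits the coalition only for interactions that contain all of its variables, this partial-overlap term is what makes $\varphi(\{x_1,x_2\})$ differ from $\phi(x_1)+\phi(x_2)$.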

Theorems & Definitions (18)

  • Definition 3.1
  • Theorem 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Corollary 3.5
  • Theorem 3.6
  • Corollary 3.7
  • Corollary 3.8
  • proof
  • proof
  • ...and 8 more