Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features

Konstandinos Kotsiopoulos, Alexey Miroshnikov, Khashayar Filom, Arjun Ravi Kannan

TL;DR

This work tackles the computational bottleneck of game-theoretic explanations by introducing a Monte Carlo framework that estimates marginal, quotient, and coalitional explainers via sampling on appropriate product spaces. By recasting explainers as expectations and leveraging carefully designed sampling measures, the authors achieve estimators with convergence guarantees and error bounds, while reducing complexity to scale linearly with the background dataset size $|\bar{D}_X|$. The methodology covers linear and coalitional game values, including two-step Shapley and Owen-type values, and provides accelerated MC variants with precomputations to further speed up inference. Empirical results on synthetic data corroborate the theoretical convergence rates and demonstrate robustness to the number of predictors, supporting practical applicability to model-agnostic explanations in regulated settings.

Abstract

In recent years, many Machine Learning (ML) explanation techniques have been designed using ideas from cooperative game theory. These game-theoretic explainers suffer from high complexity, hindering their exact computation in practical settings. In our work, we focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor vector. By viewing these explainers as expectations over appropriate sample spaces, we design a novel Monte Carlo sampling algorithm that estimates them at a reduced complexity that depends linearly on the size of the background dataset. We set up a rigorous framework for the statistical analysis and obtain error bounds for our sampling methods. The advantage of this approach is that it is fast, easily implementable, and model-agnostic. Furthermore, it has similar statistical accuracy as other known estimation techniques that are more complex and model-specific. We provide rigorous proofs of statistical convergence, as well as numerical experiments whose results agree with our theoretical findings.
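
The idea of treating an explainer as an expectation can be illustrated on the simplest case, the marginal Shapley value of a single predictor. The sketch below is not the authors' algorithm. It is a minimal, generic Monte Carlo estimator that jointly samples a coalition (with Shapley weights) and a background observation, then averages the resulting model differences. The function name `mc_marginal_shapley`, its parameters, and the toy models are hypothetical and chosen only for illustration.

```python
import random


def mc_marginal_shapley(f, x, background, i, num_samples=2000, seed=0):
    """Monte Carlo estimate of the marginal Shapley value of feature i at x.

    f          : callable taking a list of n feature values (assumed model)
    x          : list of n feature values to explain
    background : list of rows (each a list of n values), the background dataset
    """
    rng = random.Random(seed)
    n = len(x)
    others = [j for j in range(n) if j != i]
    total = 0.0
    for _ in range(num_samples):
        # Sample a coalition S of N \ {i} with Shapley weights:
        # uniform size in {0, ..., n-1}, then a uniform subset of that size.
        size = rng.randint(0, n - 1)
        coalition = set(rng.sample(others, size))
        # Sample one background observation uniformly at random.
        z = rng.choice(background)
        # Hybrid input: features in S come from x, the rest from z.
        without_i = [x[j] if j in coalition else z[j] for j in range(n)]
        with_i = list(without_i)
        with_i[i] = x[i]  # add feature i to the coalition
        total += f(with_i) - f(without_i)
    return total / num_samples


if __name__ == "__main__":
    weights = [1.0, -2.0, 0.5]
    linear = lambda v: sum(w * t for w, t in zip(weights, v))
    rows = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    print(mc_marginal_shapley(linear, [1.0, 2.0, 3.0], rows, i=1))
```

Each sample costs two model evaluations, so the cost of this sketch is governed by `num_samples` rather than by the number of coalitions. For a linear model every sampled difference equals the coefficient times the gap between the explained value and the sampled background value, which gives a convenient sanity check.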

Paper Structure

This paper contains 17 sections, 11 theorems, 46 equations, 6 figures, 8 algorithms.

Key Result

Lemma 2.1

Let $f\in [f] \in L^1(P_X)$. Then, for $P_X$-almost every $x^* \in \mathbb{R}^n$, the conditional game $v^{ \text{\tiny \it CE}}(\cdot;x^*,X,f)$ is well-defined for any subset of $N$.

Figures (6)

  • Figure 1: MC error estimate of the empirical marginal quotient Shapley for $S_1$.
  • Figure 2: MC error estimate of the empirical marginal Shapley for increasing number of predictors.
  • Figure 3: MC error estimate of the empirical marginal Owen value for $X_4$.
  • Figure 4: MC error estimate of the empirical marginal Owen value as the number of predictors increases.
  • Figure 5: MC error estimate of the empirical marginal two-step Shapley for $X_4$.
  • ...and 1 more figures

Theorems & Definitions (46)

  • Lemma 2.1
  • proof
  • Lemma 2.2
  • proof
  • Corollary 2.1
  • Lemma 2.3
  • proof
  • Definition 2.1
  • Definition 2.2
  • Remark 2.1
  • ...and 36 more