Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features
Konstandinos Kotsiopoulos, Alexey Miroshnikov, Khashayar Filom, Arjun Ravi Kannan
TL;DR
This work tackles the computational bottleneck of game-theoretic explanations by introducing a Monte Carlo framework that estimates marginal, quotient, and coalitional explainers via sampling on appropriate product spaces. By recasting explainers as expectations and leveraging carefully designed sampling measures, the authors achieve estimators with convergence guarantees and error bounds, while reducing complexity to scale linearly with the background dataset size $|\bar{D}_X|$. The methodology covers linear and coalitional game values, including two-step Shapley and Owen-type values, and provides accelerated MC variants with precomputations to further speed up inference. Empirical results on synthetic data corroborate the theoretical convergence rates and demonstrate robustness to the number of predictors, supporting practical applicability to model-agnostic explanations in regulated settings.
Abstract
In recent years, many Machine Learning (ML) explanation techniques have been designed using ideas from cooperative game theory. These game-theoretic explainers suffer from high computational complexity, which hinders their exact computation in practical settings. In our work, we focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor vector. By viewing these explainers as expectations over appropriate sample spaces, we design a novel Monte Carlo sampling algorithm that estimates them at a reduced complexity that depends linearly on the size of the background dataset. We set up a rigorous framework for the statistical analysis and obtain error bounds for our sampling methods. The advantage of this approach is that it is fast, easy to implement, and model-agnostic. Furthermore, its statistical accuracy is comparable to that of other known estimation techniques that are more complex and model-specific. We provide rigorous proofs of statistical convergence, as well as numerical experiments whose results agree with our theoretical findings.
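To make the idea of "explainers as expectations" concrete, here is a minimal sketch for the simplest case only: the Shapley value of the marginal game. It is not the authors' algorithm and does not cover quotient, coalitional, two-step Shapley, or Owen-type values. It uses the standard fact that the set of features preceding feature $i$ in a uniformly random permutation follows the Shapley coalition distribution, so one draw consists of a (coalition, background row) pair. The function name `mc_marginal_shapley`, its parameters, and the convention that the model `f` maps a 2-D array of rows to a 1-D array of predictions are all assumptions made for illustration.

```python
import numpy as np


def mc_marginal_shapley(f, x, background, i, n_samples=2000, rng=None):
    """Monte Carlo estimate of the marginal Shapley value of feature i at x.

    Hypothetical helper, not taken from the paper. Each draw pairs a random
    coalition S with one background row z and evaluates the model twice.
    """
    rng = np.random.default_rng(rng)
    n = x.shape[0]
    total = 0.0
    for _ in range(n_samples):
        # Features preceding i in a uniform random permutation form a
        # coalition S distributed according to the Shapley weights.
        perm = rng.permutation(n)
        pos = int(np.where(perm == i)[0][0])
        S = perm[:pos]
        # One uniformly sampled row of the background dataset.
        z = background[rng.integers(len(background))]
        # Hybrid point: features in S come from x, the rest from z.
        without_i = z.copy()
        without_i[S] = x[S]
        # Same hybrid point, with feature i also taken from x.
        with_i = without_i.copy()
        with_i[i] = x[i]
        total += f(with_i[None, :])[0] - f(without_i[None, :])[0]
    return total / n_samples


# Illustrative usage with an assumed linear model and synthetic data.
w = np.array([2.0, -1.0, 0.5])
linear_model = lambda X: X @ w
x0 = np.array([1.0, 2.0, 3.0])
bg = np.random.default_rng(0).normal(size=(500, 3))
estimates = [mc_marginal_shapley(linear_model, x0, bg, i, rng=1) for i in range(3)]
print(estimates)
```

For a linear model, the marginal Shapley value of feature $i$ equals $w_i (x_i - \bar{z}_i)$, where $\bar{z}_i$ is the background mean of feature $i$, which gives a simple sanity check for the estimate. In this sketch each draw costs two model evaluations regardless of the background size; how the paper's estimators organize the sampling to obtain their stated complexity and error bounds is not reproduced here.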
