ExplainReduce: Summarising local explanations via proxies
Lauri Seppäläinen, Mudong Guo, Kai Puolamäki
TL;DR
ExplainReduce tackles the instability and redundancy of local explanations by aggregating them into a compact proxy set that globally explains a closed-box model. It formalises the reduction as an optimisation problem with coverage and fidelity objectives and demonstrates that greedy reduction can yield proxy sets that rival or exceed the full set of local explanations in fidelity while being easier to interpret. The approach is agnostic to both the underlying closed-box model and the local explanation method, and the proxy sets generalise even when only a limited number of initial explanations is available, which keeps the procedure scalable. In practice, this yields succinct, stable, and faithful global explanations across diverse domains, from synthetic benchmarks to scientific datasets such as particle jets.
Abstract
Most commonly used non-linear machine learning methods are closed-box models, uninterpretable to humans. The field of explainable artificial intelligence (XAI) aims to develop tools to examine the inner workings of these closed boxes. An often-used model-agnostic approach to XAI involves using simple models as local approximations to produce so-called local explanations; examples of this approach include LIME, SHAP, and SLISEMAP. This paper shows how a large set of local explanations can be reduced to a small "proxy set" of simple models, which can act as a generative global explanation. This reduction procedure, ExplainReduce, can be formulated as an optimisation problem and approximated efficiently using greedy heuristics.
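To make the idea of greedy reduction concrete, the following is a minimal sketch of one possible coverage-driven heuristic, not the authors' implementation. It assumes a precomputed loss matrix whose entry (i, j) is the loss of local model i on data item j, and a hypothetical threshold `epsilon` below which a local model counts as applicable to an item. The names `greedy_max_coverage`, `loss`, `epsilon`, and `k` are illustrative assumptions.

```python
import numpy as np


def greedy_max_coverage(loss, epsilon, k):
    """Greedily pick up to k local models (rows of `loss`) as proxies.

    loss    : (m, n) array; loss[i, j] is the loss of local model i on item j
              (assumed to be precomputed; not part of the original text).
    epsilon : hypothetical loss threshold under which a model "covers" an item.
    k       : maximum size of the proxy set.

    Returns the list of selected row indices and the fraction of items covered.
    """
    applicable = loss <= epsilon                   # (m, n) boolean applicability
    covered = np.zeros(loss.shape[1], dtype=bool)  # items covered so far
    proxies = []
    for _ in range(k):
        # Marginal gain: number of still-uncovered items each model would add.
        gains = (applicable & ~covered).sum(axis=1)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            break                                  # no model adds new coverage
        proxies.append(best)
        covered |= applicable[best]
    return proxies, float(covered.mean())


# Toy example: 4 local models evaluated on 4 items.
toy_loss = np.array([
    [0.1, 0.1, 0.9, 0.9],
    [0.9, 0.9, 0.1, 0.9],
    [0.1, 0.9, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.1],
])
proxies, coverage = greedy_max_coverage(toy_loss, epsilon=0.5, k=2)
```

On this toy matrix, the sketch selects models 0 and 1, which together cover three of the four items. A fidelity-oriented variant would instead pick, at each step, the model that most reduces the average loss obtained when every item is explained by its best selected proxy.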
