Unifying Formal Explanations: A Complexity-Theoretic Perspective
Shahaf Bassan, Xuanxiang Huang, Guy Katz
TL;DR
This work presents a unified, complexity-theoretic framework for computing ML explanations. It recasts diverse explanation notions (local or global, probabilistic or non-probabilistic, sufficient or contrastive) as a single minimal subset selection problem over a value function $v$. Under natural distribution assumptions, the value functions of global explanations exhibit monotonicity, submodularity, and/or supermodularity, which enables polynomial-time computation with provable approximation guarantees across neural networks, decision trees, and tree ensembles; local explanations, in contrast, remain NP-hard even in simplified settings. The authors introduce greedy algorithms and submodular-set-cover-inspired approaches to obtain subset-minimal and cardinally minimal explanations, with constant-factor approximations for the global case under empirical distributions and feature independence, and they show strong inapproximability for the local case. These results clarify when provable guarantees are attainable in XAI, marking both the tractable frontier and the fundamental hardness, and point to practical ways of computing reliable explanations for real-world models. Together they offer a principled bridge between combinatorial optimization and formal XAI.
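To make the greedy idea concrete, here is a minimal sketch of how a subset-minimal explanation could be extracted when the value function is monotone non-decreasing. It is an illustration only, not the authors' algorithm: the toy model `f`, the uniform independent distribution over Boolean features, the threshold `delta = 0.9`, and all function names are assumptions introduced here.

```python
from itertools import product

def make_global_value(f, n):
    """Toy global value function (an assumption for illustration):
    v(S) = Pr_{x,z}[ f(x_S ; z_rest) == f(x) ], with x and z drawn
    independently and uniformly from {0,1}^n, computed by enumeration."""
    points = list(product([0, 1], repeat=n))

    def v(S):
        agree = 0
        for x in points:
            for z in points:
                # Keep the features in S from x, resample the rest from z.
                hybrid = tuple(x[i] if i in S else z[i] for i in range(n))
                agree += f(hybrid) == f(x)
        return agree / (len(points) ** 2)

    return v

def greedy_subset_minimal(v, n, delta):
    """Start from all features and drop each one whose removal keeps
    v(S) >= delta. If v is monotone non-decreasing, one pass leaves a
    subset-minimal S: no single remaining feature can be removed."""
    S = set(range(n))
    for i in range(n):
        if v(S - {i}) >= delta:
            S.remove(i)
    return S

# Hypothetical model: the prediction depends on features 0 and 1 only.
f = lambda x: int(x[0] and x[1])
v = make_global_value(f, 3)
explanation = greedy_subset_minimal(v, 3, delta=0.9)
print(sorted(explanation))
```

The single pass relies on monotonicity: once removing a feature drops the value below `delta`, removing it from any smaller set fails as well, so it never needs to be revisited. Without monotonicity, the returned set may not be subset-minimal.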
Abstract
Previous work has explored the computational complexity of deriving two fundamental types of explanations for ML model predictions: (1) *sufficient reasons*, which are subsets of input features that, when fixed, determine a prediction, and (2) *contrastive reasons*, which are subsets of input features that, when modified, alter a prediction. Prior studies have examined these explanations in different contexts, such as non-probabilistic versus probabilistic frameworks and local versus global settings. In this study, we introduce a unified framework for analyzing these explanations, demonstrating that they can all be characterized through the minimization of a unified probabilistic value function. We then prove that the complexity of these computations is influenced by three key properties of the value function: (1) *monotonicity*, (2) *submodularity*, and (3) *supermodularity*, all of which are fundamental properties in *combinatorial optimization*. Our findings uncover some counterintuitive results regarding the nature of these properties within the explanation settings examined. For instance, although the *local* value functions exhibit neither monotonicity nor submodularity or supermodularity, we demonstrate that the *global* value functions do possess these properties. This distinction enables us to prove a series of novel polynomial-time results for computing various explanations with provable guarantees in the global explainability setting, across a range of ML models that span the interpretability spectrum, such as neural networks, decision trees, and tree ensembles. In contrast, we show that even highly simplified versions of these explanations become NP-hard to compute in the corresponding local explainability setting.
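For reference, the three properties named above have the following standard definitions for a set function $v$ over subsets of the feature indices. The notation ($[n]$ for the index set, $S$, $T$, and $i$) is assumed here for illustration and may differ from the notation used in the paper.

$$
\begin{aligned}
\text{Monotone (non-decreasing):}\quad & v(S) \le v(T) && \text{for all } S \subseteq T \subseteq [n],\\
\text{Submodular:}\quad & v(S \cup \{i\}) - v(S) \ge v(T \cup \{i\}) - v(T) && \text{for all } S \subseteq T \subseteq [n],\ i \notin T,\\
\text{Supermodular:}\quad & v(S \cup \{i\}) - v(S) \le v(T \cup \{i\}) - v(T) && \text{for all } S \subseteq T \subseteq [n],\ i \notin T.
\end{aligned}
$$

Submodularity says that the marginal gain of adding a feature shrinks as the set grows; supermodularity says that it grows. A function that is both submodular and supermodular is modular (additive).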
