Efficient Contrastive Explanations on Demand
Yacine Izza, Joao Marques-Silva
TL;DR
This work addresses the challenge of producing rigorous, distance-restricted, symbolic explanations for ML models with many features, linking explainability to adversarial robustness. It introduces formal definitions for distance-restricted abductive and contrastive explanations, alongside a duality framework based on minimal hitting sets, and proposes scalable algorithms including the parallel SwiftCXp for efficient computation. The paper also provides methods to enumerate explanations (MARCO-like) and compute smallest CXps via MaxSMT/MaxSAT and abstraction refinement, with empirical validation on image-classification benchmarks showing superior performance and scalability. The approach enables precise, local explanations and feature-attribution insights suitable for debugging, trust, and robustness analyses in large neural networks.
Abstract
Recent work revealed a tight connection between adversarial robustness and restricted forms of symbolic explanations, namely distance-based (formal) explanations. This connection is significant because it represents a first step towards making the computation of symbolic explanations as efficient as deciding the existence of adversarial examples, especially for highly complex machine learning (ML) models. However, a major performance bottleneck remains, because of the very large number of features that ML models may possess, in particular for deep neural networks. This paper proposes novel algorithms to compute the so-called contrastive explanations for ML models with a large number of features, by leveraging on adversarial robustness. Furthermore, the paper also proposes novel algorithms for listing explanations and finding smallest contrastive explanations. The experimental results demonstrate the performance gains achieved by the novel algorithms proposed in this paper.
