Table of Contents
Fetching ...

Efficient Contrastive Explanations on Demand

Yacine Izza, Joao Marques-Silva

TL;DR

This work addresses the challenge of producing rigorous, distance-restricted, symbolic explanations for ML models with many features, linking explainability to adversarial robustness. It introduces formal definitions for distance-restricted abductive and contrastive explanations, alongside a duality framework based on minimal hitting sets, and proposes scalable algorithms including the parallel SwiftCXp for efficient computation. The paper also provides methods to enumerate explanations (MARCO-like) and compute smallest CXps via MaxSMT/MaxSAT and abstraction refinement, with empirical validation on image-classification benchmarks showing superior performance and scalability. The approach enables precise, local explanations and feature-attribution insights suitable for debugging, trust, and robustness analyses in large neural networks.

Abstract

Recent work revealed a tight connection between adversarial robustness and restricted forms of symbolic explanations, namely distance-based (formal) explanations. This connection is significant because it represents a first step towards making the computation of symbolic explanations as efficient as deciding the existence of adversarial examples, especially for highly complex machine learning (ML) models. However, a major performance bottleneck remains, because of the very large number of features that ML models may possess, in particular for deep neural networks. This paper proposes novel algorithms to compute the so-called contrastive explanations for ML models with a large number of features, by leveraging on adversarial robustness. Furthermore, the paper also proposes novel algorithms for listing explanations and finding smallest contrastive explanations. The experimental results demonstrate the performance gains achieved by the novel algorithms proposed in this paper.

Efficient Contrastive Explanations on Demand

TL;DR

This work addresses the challenge of producing rigorous, distance-restricted, symbolic explanations for ML models with many features, linking explainability to adversarial robustness. It introduces formal definitions for distance-restricted abductive and contrastive explanations, alongside a duality framework based on minimal hitting sets, and proposes scalable algorithms including the parallel SwiftCXp for efficient computation. The paper also provides methods to enumerate explanations (MARCO-like) and compute smallest CXps via MaxSMT/MaxSAT and abstraction refinement, with empirical validation on image-classification benchmarks showing superior performance and scalability. The approach enables precise, local explanations and feature-attribution insights suitable for debugging, trust, and robustness analyses in large neural networks.

Abstract

Recent work revealed a tight connection between adversarial robustness and restricted forms of symbolic explanations, namely distance-based (formal) explanations. This connection is significant because it represents a first step towards making the computation of symbolic explanations as efficient as deciding the existence of adversarial examples, especially for highly complex machine learning (ML) models. However, a major performance bottleneck remains, because of the very large number of features that ML models may possess, in particular for deep neural networks. This paper proposes novel algorithms to compute the so-called contrastive explanations for ML models with a large number of features, by leveraging on adversarial robustness. Furthermore, the paper also proposes novel algorithms for listing explanations and finding smallest contrastive explanations. The experimental results demonstrate the performance gains achieved by the novel algorithms proposed in this paper.

Paper Structure

This paper contains 32 sections, 2 theorems, 13 equations, 1 figure, 1 table, 5 algorithms.

Key Result

Proposition 1

Consider an explanation problem ${\mathcal{E}}=({\mathcal{M}},(\mathbf{v},c))$ and some $\epsilon>0$ for norm $l_{p}$. Let $\mathbf{x}\in\mathbb{F}$, with $\lVert\mathbf{x}-\mathbf{v}\rVert_p\le\epsilon$, and let ${\mathcal{D}}=\{i\in{\mathcal{F}}\,|\,x_i\not=v_i\}$. Then,

Figures (1)

  • Figure 1: Visualize Feature Attribution-based Contrastive explanations for image datasets: MNIST and GTSRB.

Theorems & Definitions (12)

  • Example 1
  • Example 2
  • Example 3
  • Definition 1: Distance-restricted (W)AXp, $\mathfrak{d}$(W)AXp
  • Definition 2: Distance-restricted (W)CXp, $\mathfrak{d}$(W)CXp
  • Example 4
  • Remark 1
  • Proposition 1
  • Proposition 2
  • Example 5
  • ...and 2 more