Multiple Different Black Box Explanations for Image Classifiers

Hana Chockler, David A. Kelly, Daniel Kroening

TL;DR

The paper tackles the limitation of single explanations for image classifier decisions by introducing MultEX, a black-box method grounded in actual causality to output multiple, diverse explanations for a given image. It models classifiers as probabilistic causal networks with pixel-level endogenous variables, defines explanations as minimal pixel subsets that guarantee the top-class decision, and proves the problem’s intractability while offering a practical, plug-in algorithm (CAUSAL_RANK, searchlight-based exploration, minimize, and separate) to approximate multiple explanations. Across ImageNet-1k, VOC2012, and ECSSD with ResNet50, ConvNext, and ViT-B-32, MultEX produces more explanations that are smaller and consistently tied to the top classification, and it remains robust to probability-threshold settings unlike Sag. The approach yields disjoint, localized explanations even on occluded images, enhancing interpretability and debugging capabilities for modern CNNs and transformer-based vision models, with code and data openly available.

Abstract

Existing explanation tools for image classifiers usually give only a single explanation for an image's classification. For many images, however, image classifiers accept more than one explanation for the image label. These explanations are useful for analyzing the decision process of the classifier and for detecting errors. Thus, restricting the number of explanations to just one severely limits insight into the behavior of the classifier. In this paper, we describe an algorithm and a tool, MultEX, for computing multiple explanations of the output of a black-box image classifier for a given image. Our algorithm uses a principled approach based on actual causality. We analyze its theoretical complexity and evaluate MultEX against the state-of-the-art across three different models and three different datasets. We find that MultEX finds more explanations and that these explanations are of higher quality.
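
The TL;DR above describes an explanation as a minimal pixel subset that still guarantees the top-class decision. The sketch below illustrates that notion only; it is not the paper's algorithm. The classifier `toy_classifier`, the zero-valued occlusion baseline, and the one-pixel-at-a-time `greedy_minimize` loop are all assumptions made for illustration, and the greedy loop yields a subset that is minimal only in the sense that no single remaining pixel can be dropped.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]
Image = List[List[float]]


def toy_classifier(image: Image) -> List[float]:
    """Hypothetical stand-in for a black-box model; returns two class scores.

    Class 1 wins when the top-left or bottom-right 2x2 corner is bright enough.
    """
    n = len(image)
    top_left = sum(image[r][c] for r in range(2) for c in range(2)) / 4
    bottom_right = sum(
        image[r][c] for r in range(n - 2, n) for c in range(n - 2, n)
    ) / 4
    return [0.5, max(top_left, bottom_right)]


def top_class(scores: List[float]) -> int:
    """Index of the highest score (first index wins ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def occlude(image: Image, keep: Set[Pixel], baseline: float = 0.0) -> Image:
    """Keep only the pixels in `keep`; set every other pixel to `baseline`."""
    return [
        [image[r][c] if (r, c) in keep else baseline for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def is_sufficient(
    classifier: Callable[[Image], List[float]],
    image: Image,
    keep: Set[Pixel],
    target: int,
) -> bool:
    """True if the occluded image is still assigned the target class."""
    return top_class(classifier(occlude(image, keep))) == target


def greedy_minimize(
    classifier: Callable[[Image], List[float]],
    image: Image,
    keep: Set[Pixel],
    target: int,
) -> Set[Pixel]:
    """Drop pixels one at a time while the remaining subset stays sufficient."""
    current = set(keep)
    for pixel in sorted(keep):
        current.discard(pixel)
        if not is_sufficient(classifier, image, current, target):
            current.add(pixel)  # this pixel is needed, so restore it
    return current


image = [[1.0] * 4 for _ in range(4)]
target = top_class(toy_classifier(image))
all_pixels = {(r, c) for r in range(4) for c in range(4)}
explanation = greedy_minimize(toy_classifier, image, all_pixels, target)
print(sorted(explanation))  # [(2, 3), (3, 2), (3, 3)]
```

In this toy setup the top-left corner would serve equally well as an explanation, which is the situation the paper targets: a single run of a minimization procedure reports one sufficient subset and hides the others.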

Paper Structure

This paper contains 9 sections, 2 theorems, 3 figures, 7 tables, and 3 algorithms.

Key Result

Lemma 1

Given an input image and one explanation, the decision problem of a different explanation is DP-complete.
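
For readers unfamiliar with the class named in the lemma, the standard definition of DP (the "difference polynomial time" class of Papadimitriou and Yannakakis) is recalled below. This is general background rather than a statement taken from the paper.

```latex
\mathrm{DP} \;=\; \{\, L_1 \cap L_2 \;:\; L_1 \in \mathrm{NP},\ L_2 \in \mathrm{coNP} \,\}
```

A DP-complete problem is therefore at least as hard as every problem in NP and every problem in coNP, which is consistent with the TL;DR's remark that the authors turn to an approximation algorithm.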

Figures (3)

  • Figure 1: ImageNet class $752$: racket, according to ResNet50. \ref{subfig:1}, \ref{subfig:6}, and \ref{subfig:9} show $3$ minimal, sufficient explanations for class $752$. Only \ref{subfig:6} and \ref{subfig:9} contain part of the racket. The tennis player's shorts are also classified as racket, with a higher confidence than either \ref{subfig:9} or \ref{subfig:6}.
  • Figure 2: A schematic depiction of MultEX, returning a set of explanations $\mathcal{E}$ for a given input image. Its components: ① ranking generates a responsibility landscape of pixels; ② search launches $x$ searchlight searches over the landscape; ③ drain minimizes the explanations found in ②; ④ separate produces a maximal subset $\mathcal{E}$ from the output of ③, with the given overlap bound.
  • Figure 3: \ref{occ:one} shows a disjoint explanation produced by MultEX for \ref{occ:bus}. MultEX also produces the explanation \ref{occ:two}, which contains just a thin strip of the bus on the left-hand side. Although \ref{occ:two} is smaller, a human might prefer \ref{occ:one}. \ref{occ:landscape} shows the responsibility landscape MultEX produces. All explanations are resized to $224\times224$ as per model requirements. \ref{occ:sag} shows that Sag fails to find a disjoint explanation and includes the man's shoulder.
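
Step ④ in the Figure 2 caption selects a maximal subset of explanations subject to an overlap bound. A minimal sketch of one way to do this is given below. The Dice coefficient as the overlap measure, the smallest-first ordering, and the names `dice_overlap` and `separate` are assumptions made for illustration; the caption does not specify them. A greedy pass of this kind returns a maximal set (no rejected candidate could be added afterwards), not necessarily the largest possible one.

```python
from typing import FrozenSet, List, Tuple

Pixel = Tuple[int, int]
Explanation = FrozenSet[Pixel]


def dice_overlap(a: Explanation, b: Explanation) -> float:
    """Dice coefficient of two pixel sets: 0.0 if disjoint, 1.0 if identical."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def separate(
    explanations: List[Explanation], max_overlap: float = 0.0
) -> List[Explanation]:
    """Greedily keep explanations whose overlap with every kept one is within the bound.

    Candidates are visited smallest first (an assumed preference for small
    explanations). Every rejected candidate conflicts with something already
    kept, so the result is maximal under the bound.
    """
    chosen: List[Explanation] = []
    for candidate in sorted(explanations, key=len):
        if all(dice_overlap(candidate, kept) <= max_overlap for kept in chosen):
            chosen.append(candidate)
    return chosen


# Three hypothetical explanations given as sets of (row, column) pixels.
a = frozenset({(0, 0), (0, 1)})
b = frozenset({(0, 1), (0, 2), (0, 3)})  # shares pixel (0, 1) with a
c = frozenset({(3, 3)})

disjoint = separate([a, b, c])                   # bound 0.0: pairwise disjoint
relaxed = separate([a, b, c], max_overlap=0.5)   # some overlap tolerated
print(len(disjoint), len(relaxed))  # 2 3
```

With a bound of zero, `b` is rejected because it shares a pixel with `a`; raising the bound to 0.5 admits it, since the Dice overlap of `a` and `b` is 0.4.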

Theorems & Definitions (7)

  • Definition 1
  • Definition 2
  • Definition 3: Explanation for image classification [CKS21]
  • Lemma 1
  • proof
  • Lemma 2
  • proof