Explaining Image Classifiers

Hana Chockler, Joseph Y. Halpern

TL;DR

This paper addresses how to explain image classifiers through the Halpern-Pearl causal framework, critiquing MMTS for misaligning with Halpern's definitions and proposing to use the full causal definition to handle absence and rare events. It models classifiers as probabilistic causal systems and analyzes how MMTS's two-layer, pixel-only dependencies relate to Halpern's actual causation and explanation concepts, including the role of context sets K and goodness measures. The authors argue that Halpern's definition can subsume MMTS's insights while providing richer explanations, and they discuss extending explanations to negative outcomes and rare events with domain knowledge to improve tractability. The work aims to improve the theoretical grounding and practical quality of image-classifier explanations, enabling robust, domain-informed interpretations beyond positive-label predictions and standard feature attributions.

Abstract

We focus on explaining image classifiers, taking the work of Mothilal et al. [2021] (MMTS) as our point of departure. We observe that, although MMTS claim to be using the definition of explanation proposed by Halpern [2016], they do not quite do so. Roughly speaking, Halpern's definition has a necessity clause and a sufficiency clause. MMTS replace the necessity clause by a requirement that, as we show, implies it. Halpern's definition also allows agents to restrict the set of options considered. While these differences may seem minor, as we show, they can have a nontrivial impact on explanations. We also show that, essentially without change, Halpern's definition can handle two issues that have proved difficult for other approaches: explanations of absence (when, for example, an image classifier for tumors outputs "no tumor") and explanations of rare events (such as tumors).
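To make the "necessity clause" and "sufficiency clause" language above concrete, here is a minimal Python sketch on a toy problem. It is an illustration under stated assumptions, not the formal definition of Halpern [2016] or the MMTS procedure: the four-pixel binary images, the `classify` rule, and the helper names `override`, `is_sufficient`, and `is_but_for_necessary` are all hypothetical, and the necessity check is a simple but-for test rather than the full clause. The `contexts` argument loosely mirrors the idea that an agent can restrict the set of options considered.

```python
from itertools import product

def classify(image):
    """Hypothetical toy classifier: 'tumor' iff pixels 0 and 1 are both on."""
    return "tumor" if image[0] == 1 and image[1] == 1 else "no tumor"

def override(image, assignment):
    """Return a copy of image with the pixels in assignment forced to the given values."""
    return tuple(assignment.get(i, v) for i, v in enumerate(image))

def is_sufficient(assignment, label, contexts):
    """Loose sufficiency: forcing the assignment yields label in every context considered."""
    return all(classify(override(img, assignment)) == label for img in contexts)

def is_but_for_necessary(assignment, image):
    """Loose necessity (but-for): some other setting of the same pixels changes the label."""
    label = classify(image)
    pixels = sorted(assignment)
    for values in product((0, 1), repeat=len(pixels)):
        alternative = dict(zip(pixels, values))
        if classify(override(image, alternative)) != label:
            return True
    return False

# All 16 binary images with four pixels, and one observed image labeled 'tumor'.
all_images = list(product((0, 1), repeat=4))
image = (1, 1, 0, 1)
candidate = {0: 1, 1: 1}

print(is_sufficient(candidate, "tumor", all_images))  # True
print(is_sufficient({0: 1}, "tumor", all_images))     # False

# Restricting the options considered to images where pixel 1 is already on
# makes the smaller assignment sufficient.
restricted = [img for img in all_images if img[1] == 1]
print(is_sufficient({0: 1}, "tumor", restricted))     # True

print(is_but_for_necessary(candidate, image))         # True
print(is_but_for_necessary({2: 0}, image))            # False
```

In this toy setting, shrinking the set of contexts changes which pixel sets count as sufficient, which gives a small-scale picture of why restricting the options considered can affect the resulting explanation.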

Paper Structure (4 sections, 2 theorems)

Key Result

Theorem 1

Given a set $\vec{X}'$ of endogenous variables in a causal model $M$ such that (a) the variables in $\vec{X}'$ are causally independent, (b) $\vec{X}'$ is determined by the context, (c) $\vec{X}'$ includes all the parents of the variables in $\varphi$, (d) there is some setting $\vec{x}'$ of the var...

Theorems & Definitions (12)

  • Definition 1
  • Definition 2
  • Example 1
  • Definition 3
  • Definition 4
  • Theorem 1
  • Definition 5
  • Definition 6
  • Theorem 2
  • Example 2
  • ...and 2 more