Causal Explanations for Image Classifiers

Hana Chockler, David A. Kelly, Daniel Kroening, Youcheng Sun

TL;DR

ReX is the most efficient black-box tool and produces the smallest explanations, and it outperforms other black-box tools on standard quality measures.

Abstract

Existing algorithms for explaining the output of image classifiers use different definitions of explanations and a variety of techniques to find them. However, none of the existing tools use a principled approach based on formal definitions of cause and explanation. In this paper we present a novel black-box approach to computing explanations grounded in the theory of actual causality. We prove relevant theoretical results and present an algorithm for computing approximate explanations based on these definitions. We prove termination of our algorithm and discuss its complexity and the amount of approximation compared to the precise definition. We implemented the framework in a tool ReX and we present experimental results and a comparison with state-of-the-art tools. We demonstrate that ReX is the most efficient black-box tool and produces the smallest explanations, in addition to outperforming other black-box tools on standard quality measures.

Paper Structure

This paper contains 17 sections, 6 theorems, 4 equations, 13 figures, 3 tables, 3 algorithms.

Key Result

Lemma 5.2

The number of calls that the compositional explanation algorithm makes to the model is $O(2^s n N)$, where $s$ is the size of the partition in each step (in our setting $s=4$), $n$ is the number of pixels in the original image $x$, and $N$ is the number of initial partitions.
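
To get a feel for the size of this bound, the sketch below evaluates $2^s n N$ for one concrete setting. The function name `query_bound`, the 224x224 image size, and the choice $N=20$ are assumptions made for illustration only; they do not come from the paper, and the big-O bound hides constant factors, so the number is an order-of-magnitude guide rather than an exact query count.

```python
def query_bound(s: int, n: int, num_partitions: int) -> int:
    """Evaluate 2^s * n * N, the expression inside the O(.) of Lemma 5.2.

    s              -- number of parts per partition step (s = 4 in the paper's setting)
    n              -- number of pixels in the image
    num_partitions -- number of initial partitions N
    """
    return (2 ** s) * n * num_partitions


# Each partition step considers every subset of the s parts: 2^4 = 16 combinations.
combinations_per_step = 2 ** 4

# Assumed example values (not from the paper): a 224x224 image and N = 20.
pixels = 224 * 224
bound = query_bound(s=4, n=pixels, num_partitions=20)
print(combinations_per_step, pixels, bound)
```

The bound grows linearly in both $n$ and $N$, while the $2^s$ factor stays a constant 16 as long as the partition size is fixed at $s=4$.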

Figures (13)

  • Figure 1: A ladybug, its responsibility map, the heat map (a projection of the responsibility map onto a plane, overlaid on the original image), and a causal explanation. The minimal causal explanation computed by our tool ReX is less than $1\%$ of the image.
  • Figure 2: A depth-$2$ binary causal model $M_{\mathcal{N},x}$ for an image $x$ and a classifier $\mathcal{N}$. $\vec{v}$ is the vector of values of $\vec{V}$. The output $O \in \{0, 1\}$ indicates whether the classification of the Hadamard product of $x$ and $\vec{V}$ is the same as the original classification.
  • Figure 3: High-level overview of the compositional explanation algorithm. The causal ranking algorithm produces an approximate responsibility map (①). The pixels in the image are then ordered by their approximate responsibility, and the explanation extraction algorithm uses this ranking to produce an approximately minimal sufficient explanation (②), which captures the information required for the DNN to give the classification ('starfish' in this example).
  • Figure 4: The ReX algorithm in action: ReX creates an initial random partition of an image into $4$ sections (①). All combinations of these sections are queried by the model, with further refinement applied to those sections or combinations of sections that meet the requirements (②). Some combinations, highlighted in green, are classified as bus; others, in red, are not. After several iterations, this results in a detailed responsibility map (③), from which a minimal passing explanation can be extracted (④).
  • Figure 5: Improvement of ReX's pixel ranking on the 'bus' example as the number of iterations $N$ of the compositional explanation algorithm increases.
  • ...and 8 more figures
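
The captions for Figures 2 and 4 describe two mechanics that a small example can make concrete: masking an image with a binary vector via the Hadamard product, and querying the model on every combination of $4$ sections. The sketch below is a toy illustration under stated assumptions, not the ReX implementation: the classifier `toy_classifier` is a hypothetical stand-in for the model, the partition is a fixed split into quadrants rather than a random one, and no refinement step is performed.

```python
from itertools import product


def hadamard(image, mask):
    """Element-wise product of an image and a 0/1 mask of the same shape."""
    return [[p * m for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def quadrant_mask(size, keep):
    """Build a 0/1 mask that keeps the quadrants whose flag in `keep` is 1.

    `keep` is ordered (top-left, top-right, bottom-left, bottom-right).
    """
    half = size // 2
    return [[keep[2 * (r >= half) + (c >= half)] for c in range(size)]
            for r in range(size)]


def toy_classifier(image):
    """Hypothetical model: predicts 'bus' iff the top-left quadrant is bright."""
    half = len(image) // 2
    total = sum(image[r][c] for r in range(half) for c in range(half))
    return "bus" if total >= 4 else "other"


image = [[1] * 4 for _ in range(4)]   # toy 4x4 image, all pixels set to 1
original = toy_classifier(image)      # classification of the unmasked image

passing = []                          # combinations that preserve the classification
queries = 0
for keep in product([0, 1], repeat=4):            # all 2^4 combinations of sections
    masked = hadamard(image, quadrant_mask(4, keep))
    queries += 1
    output = int(toy_classifier(masked) == original)  # the output O in {0, 1}
    if output:
        passing.append(keep)

smallest = min(passing, key=sum)      # passing combination with the fewest sections
print(queries, len(passing), smallest)
```

In this toy setting only the top-left quadrant matters, so every passing combination keeps it, and the smallest one keeps nothing else. In the procedure the Figure 4 caption describes, passing sections or combinations would then be refined further, which this sketch leaves out.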

Theorems & Definitions (17)

  • Definition 4.0: Single-Context Explanation
  • Definition 4.0: Sufficient responsibility
  • Lemma 5.2
  • proof
  • Definition A.0: Single-Context Explanation
  • Definition A.1
  • Lemma A.2
  • proof
  • Definition A.2: Sufficient responsibility
  • Definition A.3
  • ...and 7 more