Rethinking Visual Counterfactual Explanations Through Region Constraint

Bartlomiej Sobieski, Jakub Grzywaczewski, Bartlomiej Sadlej, Matthew Tivnan, Przemyslaw Biecek

TL;DR

This work proposes region-constrained VCEs (RVCEs), which assume that only a predefined image region can be modified to influence the model's prediction. It introduces Region-Constrained Counterfactual Schrödinger Bridges (RCSB) to sample them and extends RCSB to allow for exact counterfactual reasoning.

Abstract

Visual counterfactual explanations (VCEs) have recently gained immense popularity as a tool for clarifying the decision-making process of image classifiers. This trend is largely motivated by what these explanations promise to deliver: they indicate semantically meaningful factors that change the classifier's decision. However, we argue that current state-of-the-art approaches lack a crucial component, the region constraint, whose absence prevents drawing explicit conclusions and may even lead to faulty reasoning due to phenomena like confirmation bias. To address this issue in previous methods, which modify images in a highly entangled and widely dispersed manner, we propose region-constrained VCEs (RVCEs), which assume that only a predefined image region can be modified to influence the model's prediction. To effectively sample from this subclass of VCEs, we propose Region-Constrained Counterfactual Schrödinger Bridges (RCSB), an adaptation of a tractable subclass of Schrödinger Bridges to the problem of conditional inpainting, where the conditioning signal originates from the classifier of interest. In addition to setting a new state-of-the-art by a large margin, we extend RCSB to allow for exact counterfactual reasoning, where the predefined region contains only the factor of interest, and to let the user actively interact with the RVCE by predefining the regions manually.
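
The region constraint itself can be illustrated with a minimal NumPy sketch. This is not the RCSB sampler, only the constraint it enforces: pixels outside a predefined region stay identical to the factual image. The function name `apply_region_constraint`, the array shapes, and the random stand-in images are assumptions made for illustration.

```python
import numpy as np


def apply_region_constraint(factual, candidate, region_mask):
    """Blend two images so that changes are confined to `region_mask`.

    factual:     (H, W, C) float array, the original image.
    candidate:   (H, W, C) float array, an unconstrained edited image.
    region_mask: (H, W) boolean array, True where edits are allowed.
    """
    mask = region_mask.astype(factual.dtype)
    if mask.ndim == factual.ndim - 1:
        mask = mask[..., None]  # (H, W) -> (H, W, 1) to broadcast over channels
    # Inside the region take the candidate, outside keep the factual pixels.
    return mask * candidate + (1.0 - mask) * factual


# Hypothetical stand-in data: random images and a rectangular region.
rng = np.random.default_rng(0)
factual = rng.random((8, 8, 3))
candidate = rng.random((8, 8, 3))
region_mask = np.zeros((8, 8), dtype=bool)
region_mask[2:5, 3:6] = True

rvce = apply_region_constraint(factual, candidate, region_mask)
```

In this sketch the mask is fixed before any editing takes place, which mirrors the "predefined region" assumption in the abstract; how the content inside the region is generated is left entirely to the sampler.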

Paper Structure

This paper contains 24 sections, 1 theorem, 24 equations, 24 figures, 6 tables, 4 algorithms.

Key Result

Theorem 1

If $\widehat{\boldsymbol{\Psi}}, \boldsymbol{\Psi}$ fulfill the constraints given by the equation labeled eq:sb_pdes_coupling, then $\nabla_{\mathbf{x}_t} \log \widehat{\boldsymbol{\Psi}}(\mathbf{x}_t, t)$ and $\nabla_{\mathbf{x}_t} \log \boldsymbol{\Psi}(\mathbf{x}_t, t)$ are the score functions of the following linear SDEs,

Figures (24)

  • Figure 1: Previous methods create VCEs with unconstrained changes, making it virtually impossible to understand the decision-making process of a model. We propose region-constrained VCEs, establishing a new paradigm for a comprehensible and actionable explanatory process.
  • Figure 2: Generative trajectories of I2SB and SGM. Intermediate images of I2SB are much closer to the data manifold.
  • Figure 3: Series of proposed improvements to better align the gradients of the classifier of interest with the generative trajectory. Changes to the factual image are constrained to the indicated region. Subsequent images illustrate the influence of each new adaptation. Numbers below images correspond to FID ($\downarrow$) values obtained in a larger-scale experiment (for details, see Appendix).
  • Figure 4: Example region obtained with our automated region extraction. Instead of directly binarizing an attribution map (upper row), we amplify the focus on semantic concepts (bottom row) with a simple approach based on grid cells.
  • Figure 5: Qualitative examples obtained with RCSB using automated region extraction. Each task of the form predicted class$\rightarrow$target class shows the factual image, the extracted region and the RVCE obtained with RCSB.
  • ...and 19 more figures
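
The caption of Figure 4 mentions a grid-cell approach to region extraction without giving its details. The sketch below shows one hypothetical reading, not the authors' procedure: attribution magnitudes are summed per grid cell, and the highest-scoring cells form the region. The function name `grid_cell_region` and the parameters `cell_size` and `num_cells` are assumptions introduced for illustration.

```python
import numpy as np


def grid_cell_region(attribution, cell_size, num_cells):
    """Build a boolean region mask from the top-scoring grid cells.

    attribution: (H, W) float array, e.g. a saliency map.
    cell_size:   side length of each square grid cell, must divide H and W.
    num_cells:   number of highest-scoring cells to include in the region.
    """
    h, w = attribution.shape
    if h % cell_size or w % cell_size:
        raise ValueError("cell_size must divide both image dimensions")
    gh, gw = h // cell_size, w // cell_size

    # Sum absolute attribution inside each cell: result has shape (gh, gw).
    cells = np.abs(attribution).reshape(gh, cell_size, gw, cell_size)
    cell_scores = cells.sum(axis=(1, 3))

    # Mark the `num_cells` cells with the largest total attribution.
    top = np.argsort(cell_scores.ravel())[::-1][:num_cells]
    cell_mask = np.zeros(gh * gw, dtype=bool)
    cell_mask[top] = True
    cell_mask = cell_mask.reshape(gh, gw)

    # Upsample the cell-level mask back to pixel resolution.
    return np.repeat(np.repeat(cell_mask, cell_size, axis=0), cell_size, axis=1)


# Hypothetical attribution map: one strongly attributed block plus a stray pixel.
attribution = np.zeros((8, 8))
attribution[0:4, 4:8] = 1.0
attribution[5, 1] = 0.5

region = grid_cell_region(attribution, cell_size=4, num_cells=1)
```

Compared with thresholding individual pixels, aggregating over cells in this sketch ignores the isolated stray pixel and returns a contiguous block, which is the kind of effect the caption attributes to the grid-based approach.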

Theorems & Definitions (1)

  • Theorem 1: Reformulating SB drifts as score functions [liu20232]