Reproducibility review of "Why Not Other Classes": Towards Class-Contrastive Back-Propagation Explanations

Arvid Eriksson, Anton Israelsson, Mattias Kallhauge

TL;DR

This work provides a reproducibility analysis of class-contrastive back-propagation explanations for neural image classifiers. It extends the original approach, which back-propagates from the softmax output rather than the logits, to XGradCAM, FullGrad, and Vision Transformers. By correcting an erroneous equation, filling gaps in the method description, validating across multiple back-propagation variants, and applying the method to ViTs with attention rollout, the study indicates that contrastive explanations discriminate better among competing classes and generalize beyond the original setting. Key contributions include an open-source repository, a clarified methodology, and empirical support for the claim that back-propagating from the softmax neuron $p_t$ yields more targeted heatmaps than back-propagating from the logit $y_t$. The work highlights reproducibility challenges in explainability research and provides a transferable framework for class-contrastive explanations, with potential use in debugging and model transparency in high-stakes settings.

Abstract

"Why Not Other Classes?": Towards Class-Contrastive Back-Propagation Explanations (Wang & Wang, 2022) provides a method for contrastively explaining why a neural network image classifier chooses a certain class over others. The method applies back-propagation-based explanation methods from after the softmax layer rather than before it. We reproduce the work in the original paper and extend it by evaluating the method on XGradCAM, FullGrad, and Vision Transformers to assess how well it generalizes. The reproductions show results similar to those of the original paper; the only difference is the heatmap visualizations, which we could not reproduce to look similar. The method appears to generalize well, with implementations working for Vision Transformers and alternative back-propagation methods. We also show that the original paper suffers from issues such as a lack of detail in the method and an erroneous equation, which make reproduction difficult. To remedy this, we provide an open-source repository containing all code used for this project.
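
The difference between back-propagating from the logit $y_t$ and from the softmax output $p_t$ can be illustrated with the standard softmax derivative, $\nabla_x p_t = p_t \left( \nabla_x y_t - \sum_s p_s \nabla_x y_s \right)$: the gradient of $p_t$ is the gradient of $y_t$ minus a probability-weighted average of all class gradients. The sketch below is not the authors' code. It assumes a hypothetical three-class linear model $y = Wx$ (so $\nabla_x y_k$ is simply row $k$ of $W$), and all function names and values are illustrative.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_of(W, x):
    # Toy linear classifier (an assumption for illustration): y = W x.
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def grad_logit(W, t):
    # For y = W x, the gradient of y_t with respect to x is row t of W.
    return list(W[t])

def grad_prob(W, x, t):
    # Gradient of p_t with respect to x via the softmax derivative:
    # p_t * (grad y_t - sum_s p_s * grad y_s).
    p = softmax(logits_of(W, x))
    n = len(x)
    avg = [sum(p[s] * W[s][i] for s in range(len(W))) for i in range(n)]
    return [p[t] * (W[t][i] - avg[i]) for i in range(n)]

# Feature 0 supports classes 0 and 1 almost equally;
# feature 1 supports class 0 and speaks against class 1.
W = [[1.0, 0.5, 0.0],
     [0.9, -0.5, 0.0],
     [0.0, 0.0, 1.0]]
x = [1.0, 1.0, 0.0]

from_logit = grad_logit(W, 0)   # ranks the shared feature 0 highest
from_prob = grad_prob(W, x, 0)  # ranks the discriminating feature 1 highest
print("from y_t:", from_logit)
print("from p_t:", from_prob)
```

In this toy setting the logit gradient gives the most weight to the feature shared by both leading classes, while the gradient of $p_t$ gives the most weight to the feature that separates them, which is the sense in which the post-softmax explanation is contrastive.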

Paper Structure (19 sections, 7 equations, 8 figures, 2 tables)

Figures (8)

  • Figure 1: Reproduction of Figure 3 in the original paper with $\epsilon = 3.0 \times 10^{-3}$. Changes in accuracy, $y_t$, and $p_t$ ($t$ is the target class) when certain input features are perturbed. Perturbed features are selected based on four gradient explanations (original, mean, max, and weighted), where original uses the gradients of the logits directly.
  • Figure 2: Reproduction of Figure 4 in the original paper. Comparison between back-propagation from the logits $y_t$ (Original) and weighted contrastive back-propagation from $p_t$ (Weighted) for GradCAM, Linear Approximation, and XGradCAM. The columns for each image show the most probable and second most probable class, respectively. Red and blue indicate positive and negative activations, respectively.
  • Figure 3: Comparison between back-propagation from the logits $y_t$ (Original) and weighted contrastive back-propagation from $p_t$ (Weighted) for FullGrad. The columns for each image show the most probable and second most probable class, respectively. Red and blue indicate positive and negative activations, respectively, after normalization.
  • Figure 4: Reproduction of Figure 5 in the original paper. Comparison between mean, max, and weighted contrast for four images from CUB-200. In each column, we present explanations for the three most probable classes for GradCAM using the original image and the three contrastive methods.
  • Figure 5: Comparison between proposed explanations. (a) compares GradCAM, GradCAM without ReLU, and Contrastive GradCAM, with target attention layers 8 and 10, respectively. (b) compares the standard, without-ReLU, and contrastive variants of Gradient-weighted Attention rollout (GWAR). Red sections are considered areas with high explainability. To adapt the method to the contrastive version, all ReLU operations were removed and the gradients were calculated from the softmax output instead of the logits.
  • ...and 3 more figures
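
The mean, max, and weighted contrasts named in the captions of Figures 1 and 4 can be sketched as follows. The definitions are an assumption, since this page does not state them: each contrast subtracts a combination of the other classes' explanations $\phi_s$ from the target explanation $\phi_t$, using uniform weights (mean), only the most probable other class (max), or weights proportional to $p_s$ (weighted). The function name `contrast` and the example values are hypothetical.

```python
def contrast(phi, p, t, mode):
    """Contrast the explanation of class t against the other classes.

    phi  : list of per-class explanation vectors (one list of floats per class)
    p    : list of class probabilities
    t    : index of the target class
    mode : "mean", "max", or "weighted" (definitions assumed, see lead-in)
    """
    others = [s for s in range(len(phi)) if s != t]
    if mode == "mean":
        # Every other class contributes equally.
        weights = {s: 1.0 / len(others) for s in others}
    elif mode == "max":
        # Only the most probable other class contributes.
        s_star = max(others, key=lambda s: p[s])
        weights = {s: 1.0 if s == s_star else 0.0 for s in others}
    elif mode == "weighted":
        # Other classes contribute in proportion to their probability.
        rest = sum(p[s] for s in others)
        weights = {s: p[s] / rest for s in others}
    else:
        raise ValueError(f"unknown mode: {mode}")
    return [
        phi[t][i] - sum(weights[s] * phi[s][i] for s in others)
        for i in range(len(phi[t]))
    ]

# Illustrative explanations for three classes over three input features.
phi = [[1.0, 0.5, 0.0],
       [0.9, -0.5, 0.0],
       [0.0, 0.0, 1.0]]
p = [0.6, 0.3, 0.1]

for mode in ("mean", "max", "weighted"):
    print(mode, contrast(phi, p, 0, mode))
```

In all three variants the feature shared by the two leading classes (index 0) loses weight relative to the feature that separates them (index 1); the variants differ only in how much the less probable classes are allowed to contribute.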