Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks

Rachel Lea Draelos, Lawrence Carin

TL;DR

The paper identifies a fundamental fidelity gap in Grad-CAM due to gradient averaging and introduces HiResCAM, a class-specific explanation method that faithfully reflects the regions the model actually uses to make predictions. It proves HiResCAM generalizes CAM and provides theoretical connections to regression and Gradient × Input, with rigorous results for CNNs ending in a single fully connected layer and for CAM architectures. Empirically, HiResCAM yields explanations that align with model computations on PASCAL VOC 2012, while Grad-CAM often expands attention; human studies and medical imaging examples further illustrate the trade-off between fidelity and segmentation utility. The work argues for adopting HiResCAM in sensitive contexts where trustworthy explanations are essential, while noting Grad-CAM's practical value for weakly supervised segmentation tasks.

Abstract

Explanation methods facilitate the development of models that learn meaningful concepts and avoid exploiting spurious correlations. We illustrate a previously unrecognized limitation of the popular neural network explanation method Grad-CAM: as a side effect of the gradient averaging step, Grad-CAM sometimes highlights locations the model did not actually use. To solve this problem, we propose HiResCAM, a novel class-specific explanation method that is guaranteed to highlight only the locations the model used to make each prediction. We prove that HiResCAM is a generalization of CAM and explore the relationships between HiResCAM and other gradient-based explanation methods. Experiments on PASCAL VOC 2012, including crowd-sourced evaluations, illustrate that while HiResCAM's explanations faithfully reflect the model, Grad-CAM often expands the attention to create bigger and smoother visualizations. Overall, this work advances convolutional neural network explanation approaches and may aid in the development of trustworthy models for sensitive applications.
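
The "gradient averaging step" mentioned above can be made concrete with a short worked comparison. The notation here is an assumption chosen for illustration, not taken from this page: $A^f$ is the $f$-th feature map of the last convolutional layer with spatial size $D_1 \times D_2$, and $s_m$ is the score for class $m$. The Grad-CAM form follows the standard published definition with its final ReLU omitted for brevity; the HiResCAM form follows the description of element-wise multiplying the feature map with the gradients directly.

```latex
% Grad-CAM: each gradient map is averaged to a single scalar weight per feature map
\alpha_m^f = \frac{1}{D_1 D_2} \sum_{d_1=1}^{D_1} \sum_{d_2=1}^{D_2}
             \frac{\partial s_m}{\partial A^f_{d_1 d_2}},
\qquad
\tilde{\mathcal{A}}_m^{\mathrm{GradCAM}} = \sum_{f=1}^{F} \alpha_m^f \, A^f

% HiResCAM: no averaging; gradients and feature maps are multiplied element-wise
\tilde{\mathcal{A}}_m^{\mathrm{HiResCAM}} = \sum_{f=1}^{F}
             \frac{\partial s_m}{\partial A^f} \odot A^f
```

Because $\alpha_m^f$ is a single scalar, Grad-CAM can only rescale each feature map as a whole, whereas the HiResCAM product can rescale or flip the sign of individual locations.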

Paper Structure

This paper contains 38 sections, 11 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Grad-CAM and HiResCAM produce different explanations for the same model, image, and class. HiResCAM provably reflects the locations the model used for computation, while Grad-CAM does not. (A) HiResCAM explanations are often more focal than Grad-CAM explanations. (B) Sometimes HiResCAM highlights the correct object while Grad-CAM does not; (C) other times, HiResCAM highlights more parts of the image than Grad-CAM. In this example, the Grad-CAM explanation gives the impression that the model predicted "potted plant" from the plant alone, but the HiResCAM explanation reveals that the model also attended to other parts of the image. Best viewed in color.
  • Figure 2: 2D example of how HiResCAM addresses the limitation of the gradient averaging step in Grad-CAM. The Grad-CAM explanation (equation \ref{eqn:vanillagradcam}) matches the relative magnitudes and positive-negative pattern of the original feature map (the "inverted red L shape" here), even though the gradients suggest that some elements should be re-scaled and/or change sign. HiResCAM (equation \ref{eqn:hirescam}) does not average over the gradients and instead element-wise multiplies the feature map with the gradients directly, thereby producing attention that reflects the model's computations and emphasizes the most important locations for a particular prediction. Best viewed in color.
  • Figure 3: Specific example demonstrating that for CNNs ending in a single fully connected layer, HiResCAM explanations directly reflect the calculation of the class score while Grad-CAM explanations do not. Integer input values were chosen for simplicity; actual weights and activations are not integers.
  • Figure 4: HiResCAM and Grad-CAM explanations for 72 randomly selected PASCAL VOC 2012 validation set image-class pairs for ResNet-34v. Each half of the figure includes the same images and classes. The two halves differ only in the calculation of the explanation. Best viewed in color.
  • Figure 5: Example of a visualization used in the AMT human evaluation task comparing Grad-CAM and HiResCAM explanations produced using the same model on the same input image and class. HiResCAM appeared as Image A in 50% of the visualizations and as Image B in the remaining 50%. Best viewed in color.
  • ...and 3 more figures
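
The contrast described in the captions for Figures 2 and 3 can be sketched numerically. The example below is a minimal illustration, not the authors' code: it assumes a toy network whose last convolutional layer outputs two 2×2 feature maps that feed a single fully connected layer, so the gradient of the class score with respect to the feature maps is simply the weight tensor. All values, shapes, and the function names `grad_cam` and `hires_cam` are hypothetical, with small integers chosen in the spirit of Figure 3. The Grad-CAM function omits the final ReLU to keep the comparison direct.

```python
import numpy as np

def grad_cam(A, G):
    """Grad-CAM without the final ReLU: average each gradient map to one
    scalar, then use those scalars to weight the feature maps."""
    alpha = G.mean(axis=(1, 2))            # shape (F,): one weight per feature map
    return np.tensordot(alpha, A, axes=1)  # sum_f alpha_f * A_f, shape (H, W)

def hires_cam(A, G):
    """HiResCAM: multiply gradients and feature maps element-wise,
    then sum over the feature dimension."""
    return (G * A).sum(axis=0)             # shape (H, W)

# Hypothetical feature maps from the last conv layer, shape (F=2, H=2, W=2).
A = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[1.0, 0.0], [2.0, 2.0]]])

# Hypothetical fully connected weights for one class, reshaped to match A.
w = np.array([[[1.0, -1.0], [0.0, 2.0]],
              [[1.0, 1.0], [-1.0, 3.0]]])
b = 0.5

# Class score of the single fully connected layer.
score = float((w * A).sum() + b)

# For a linear layer, d(score)/dA equals the weights.
G = w

hires = hires_cam(A, G)
grad = grad_cam(A, G)

print("score - bias :", score - b)
print("HiResCAM map :\n", hires, "\n sum =", hires.sum())
print("Grad-CAM map :\n", grad, "\n sum =", grad.sum())
```

In this toy setting the HiResCAM map sums to the class score minus the bias (12.0), so every location's value is its actual contribution to the prediction. The Grad-CAM map sums to 10.0 and is positive everywhere, even though two locations contribute negatively to the score, which mirrors the sign and re-scaling mismatch described for Figure 2.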

Theorems & Definitions (2)

  • proof
  • proof