Smooth Deep Saliency

Rudolf Herdt, Maximilian Schmidt, Daniel Otero Baguer, Peter Maaß

TL;DR

This paper tackles the checkerboard noise that stride-2 convolutions introduce into gradient-based saliency maps computed in hidden layers. It proposes three approaches (a bilinear surrogate path, a backward hook, and a forward hook) to reduce gradient noise while preserving model fidelity, and evaluates them on ImageNet1K, Camelyon16, and Digipath. The bilinear surrogate achieves substantial smoothing with minimal accuracy loss, and the backward hook offers a training-free alternative with good results; the forward hook yields smooth saliency maps but poorer faithfulness. The resulting saliency maps are smoother and easier to interpret, which may benefit clinical image analysis and model transparency, and the approach applies more broadly to CNNs that use strided downsampling.
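To make the source of the checkerboard pattern concrete, the following is a minimal sketch, not the paper's code. It computes the gradient of the summed output of a single convolution with respect to each input pixel. The all-ones kernel, the absence of padding, and the function name `conv_input_gradient` are assumptions chosen for illustration; the 16×16 size only mirrors the image size named in the Figure 1 caption. Under these assumptions the gradient at a pixel equals the number of output windows covering it, which alternates with period 2 for stride 2 and is constant in the interior for stride 1.

```python
def conv_input_gradient(size=16, kernel=3, stride=2):
    """Gradient of sum(conv(x)) w.r.t. each input pixel for an all-ones kernel.

    Illustrative assumption: with an all-ones kernel and no padding, this
    gradient is simply the number of output windows that cover each pixel.
    """
    out = (size - kernel) // stride + 1  # output positions per axis
    grad = [[0] * size for _ in range(size)]
    for oy in range(out):
        for ox in range(out):
            # Each output position passes gradient 1 back to its whole window.
            for ky in range(kernel):
                for kx in range(kernel):
                    grad[oy * stride + ky][ox * stride + kx] += 1
    return grad


strided = conv_input_gradient(stride=2)  # convolutional downsampling
dense = conv_input_gradient(stride=1)    # no downsampling

# Print the top-left corner of each gradient map for comparison.
for name, grad in (("stride 2", strided), ("stride 1", dense)):
    print(name)
    for row in grad[2:6]:
        print(" ".join(str(v) for v in row[2:10]))
```

In the stride-2 case, neighbouring interior pixels receive gradients of 4, 2, 2, and 1 in a repeating 2×2 tile, even though the kernel treats every pixel identically. The stride-1 interior is uniform, which is the kind of uneven coverage that replacing the strided downsampling is meant to avoid.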

Abstract

In this work, we investigate methods to reduce the noise in deep saliency maps coming from convolutional downsampling. Those methods make the investigated models more interpretable for gradient-based saliency maps, computed in hidden layers. We evaluate the faithfulness of those methods using insertion and deletion metrics, finding that saliency maps computed in hidden layers perform better compared to both the input layer and GradCAM. We test our approach on different models trained for image classification on ImageNet1K, and models trained for tumor detection on Camelyon16 and in-house real-world digital pathology scans of stained tissue samples. Our results show that the checkerboard noise in the gradient gets reduced, resulting in smoother and therefore easier to interpret saliency maps.

Paper Structure (23 sections, 2 equations, 21 figures, 6 tables)

Figures (21)

  • Figure 1: Checkerboard pattern in the gradient of the $16\times16$ image.
  • Figure 2: Training the bilinear surrogate. The weights of the two $3\times3$ stride-1 convolutions are not shared; they are separate convolutions.
  • Figure 3: Using the bilinear surrogate path in evaluation. The convolutional downsamplings are ignored and replaced by the bilinear surrogate.
  • Figure 4: Insertion and Deletion metrics using DeepLift.
  • Figure 5: Insertion and Deletion scores for randomizing the model, using DeepLift as attribution method.
  • ...and 16 more figures