Smooth Deep Saliency
Rudolf Herdt, Maximilian Schmidt, Daniel Otero Baguer, Peter Maaß
TL;DR
This paper tackles the checkerboard noise that stride-2 convolutions introduce into gradient-based saliency maps computed in hidden layers. It proposes three approaches (a bilinear surrogate path, a backward hook, and a forward hook) that reduce this gradient noise while aiming to preserve model fidelity, and evaluates them on ImageNet1K, Camelyon16, and Digipath. The bilinear surrogate achieves substantial smoothing with minimal accuracy loss, and the backward hook offers a training-free alternative with good results; the forward hook yields smooth saliency maps but poorer faithfulness. The resulting saliency maps are smoother and easier to interpret, which may benefit clinical imaging analysis and model transparency, and the approaches apply more broadly to CNNs that downsample with strided convolutions.
Abstract
In this work, we investigate methods to reduce the noise in deep saliency maps that arises from convolutional downsampling. These methods make the investigated models more interpretable for gradient-based saliency maps computed in hidden layers. We evaluate the faithfulness of these methods using insertion and deletion metrics and find that saliency maps computed in hidden layers perform better than both saliency maps computed at the input layer and GradCAM. We test our approach on different models trained for image classification on ImageNet1K, and on models trained for tumor detection on Camelyon16 and on in-house real-world digital pathology scans of stained tissue samples. Our results show that the checkerboard noise in the gradient is reduced, resulting in smoother and therefore easier-to-interpret saliency maps.
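As a rough illustration of where checkerboard noise in the gradient can come from, the sketch below is a toy 1D example and not the paper's method or code. It assumes a hypothetical all-ones kernel of size 3 with no padding, and the helper name `conv_input_gradient` is made up for this illustration. It computes the gradient of the summed convolution output with respect to each input position, once for stride 1 and once for stride 2.

```python
def conv_input_gradient(input_len, kernel, stride):
    """Gradient of sum(outputs) w.r.t. each input position
    for a 1D convolution without padding (toy illustration)."""
    grad = [0.0] * input_len
    k = len(kernel)
    # Each output window adds its kernel weights to the inputs it covers.
    for start in range(0, input_len - k + 1, stride):
        for j, w in enumerate(kernel):
            grad[start + j] += w
    return grad


# Assumed toy setup: 9 input positions, all-ones kernel of size 3.
kernel = [1.0, 1.0, 1.0]
grad_stride1 = conv_input_gradient(9, kernel, stride=1)
grad_stride2 = conv_input_gradient(9, kernel, stride=2)

print(grad_stride1)  # [1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 2.0, 1.0]
print(grad_stride2)  # [1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 1.0]
```

With stride 1, every interior input position is covered by the same number of output windows, so the gradient is flat away from the borders. With stride 2, neighbouring positions are covered by different numbers of windows, which produces the alternating pattern; in 2D this alternation appears along both axes as a checkerboard.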
