AdaptGrad: Adaptive Sampling to Reduce Noise

Linjiang Zhou, Chao Ma, Zepeng Wang, Libing Wu, Xiaochuan Shi

TL;DR

AdaptGrad rethinks gradient smoothing by interpreting SmoothGrad as a Gaussian convolution of the gradient and identifying extra noise from out-of-bounds sampling within bounded data domains. It then derives an adaptive sampling scheme that selects per-dimension noise scales to bound the probability of sampling outside the data domain at a chosen confidence level $c$, yielding a simple, model-agnostic method that improves denoising while preserving details. Through qualitative and quantitative experiments on MNIST and ImageNet with multiple architectures and attribution methods, AdaptGrad consistently enhances Sparseness and Faithfulness and shows favorable downstream task performance in object localization and adversarial sample generation. The approach achieves these gains with only modest computational overhead and remains compatible with existing gradient-based explanations, offering a practical, principled upgrade to gradient smoothing in XAI pipelines.
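The convolution reading mentioned above can be written out in a few lines. This is a sketch under assumed notation, not the paper's own derivation: $f$ denotes the model's class score, $p_\sigma$ the density of $\mathcal{N}(0, \sigma^2 I)$, and $\sigma$ is treated as the standard deviation of the noise, as in the original SmoothGrad formulation. $G_{sg}$ and $N$ follow the figure captions.

$$
G_{sg}(x) = \frac{1}{N}\sum_{i=1}^{N} \nabla f(x + \epsilon_i), \qquad \epsilon_i \sim \mathcal{N}(0, \sigma^2 I)
$$

$$
\mathbb{E}\left[G_{sg}(x)\right] = \int \nabla f(x + \epsilon)\, p_\sigma(\epsilon)\, d\epsilon = (\nabla f * p_\sigma)(x)
$$

The last equality uses the symmetry $p_\sigma(\epsilon) = p_\sigma(-\epsilon)$. The integral runs over all of $\mathbb{R}^d$, so when the data live in a bounded domain such as $[x_{\min}, x_{\max}]^d$, part of the Gaussian mass falls on points $x + \epsilon$ outside that domain. That mass is the out-of-bounds sampling the summary refers to.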

Abstract

Gradient smoothing is an efficient approach to reducing noise in gradient-based model explanation methods. SmoothGrad adds Gaussian noise to the input to mitigate much of this noise. However, the crucial hyper-parameter of this method, the variance $\sigma$ of the Gaussian noise, is set manually or by a heuristic, so the smoothed gradients still contain a certain amount of noise. In this paper, we interpret SmoothGrad as a corollary of convolution, thereby re-understanding the gradient noise and the role of $\sigma$ from the perspective of confidence level. Based on these insights, we propose an adaptive gradient smoothing method, AdaptGrad. Comprehensive qualitative and quantitative experiments demonstrate that AdaptGrad effectively reduces almost all the noise in vanilla gradients compared with baseline methods. AdaptGrad is simple and universal, making it applicable for enhancing gradient-based interpretability methods for better visualization.
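To make the confidence-level view concrete, the following minimal sketch picks a per-dimension noise scale so that a Gaussian sample stays inside a bounded domain with at least a chosen probability. It is an illustration under stated assumptions, not the paper's exact rule: the function names, the `in_bounds_prob` parameter (how it maps to the paper's $c$ is an assumption), the nearest-boundary rule, and the `[0, 1]` pixel range are all hypothetical choices made for this example. The fixed baseline uses the common SmoothGrad convention $\sigma = \alpha (x_{\max} - x_{\min})$ with $\alpha = 0.2$.

```python
from statistics import NormalDist


def adaptive_sigmas(x, lo=0.0, hi=1.0, in_bounds_prob=0.95):
    """Per-dimension std so that P(lo <= x_i + noise <= hi) >= in_bounds_prob.

    Hypothetical rule for illustration: scale sigma_i by the distance from
    x_i to its nearest domain boundary.
    """
    if not 0.0 < in_bounds_prob < 1.0:
        raise ValueError("in_bounds_prob must lie strictly between 0 and 1")
    # Two-sided quantile: P(|Z| <= z) = in_bounds_prob for a standard normal Z.
    z = NormalDist().inv_cdf((1.0 + in_bounds_prob) / 2.0)
    sigmas = []
    for xi in x:
        # Distance to the nearest boundary; 0 for values on the boundary,
        # which then receive no noise at all.
        distance = max(min(xi - lo, hi - xi), 0.0)
        sigmas.append(distance / z)
    return sigmas


def out_of_bounds_prob(xi, sigma, lo=0.0, hi=1.0):
    """Exact probability that xi + N(0, sigma^2) falls outside [lo, hi]."""
    if sigma == 0.0:
        return 0.0 if lo <= xi <= hi else 1.0
    noisy = NormalDist(mu=xi, sigma=sigma)
    return noisy.cdf(lo) + (1.0 - noisy.cdf(hi))


# Toy "image" of four pixel values in [0, 1].
pixels = [0.5, 0.1, 0.9, 0.0]
adaptive = adaptive_sigmas(pixels, in_bounds_prob=0.95)
# Fixed SmoothGrad-style scale: alpha * (x_max - x_min) with alpha = 0.2.
fixed = [0.2 * (1.0 - 0.0)] * len(pixels)

for xi, s_fixed, s_adapt in zip(pixels, fixed, adaptive):
    print(
        f"x={xi:.1f}  "
        f"fixed sigma={s_fixed:.3f} -> P(out)={out_of_bounds_prob(xi, s_fixed):.3f}  "
        f"adaptive sigma={s_adapt:.3f} -> P(out)={out_of_bounds_prob(xi, s_adapt):.3f}"
    )
```

Under this rule, pixels near the boundary get a smaller noise scale, while a single fixed scale lets a sizeable share of their samples leave the domain. One consequence of the design is that a pixel exactly on the boundary receives zero noise. Whether AdaptGrad handles that case the same way is not stated in this summary.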

Paper Structure

This paper contains 23 sections, 12 equations, 17 figures, and 11 tables.

Figures (17)

  • Figure 1: An example to compare the visual performance between different gradient smoothing methods.
  • Figure 2: The visual saliency map $G_{sg}$ of SmoothGrad with different sampling numbers $N$ and $\alpha=0.2$. The classification model is VGG16 (Simonyan and Zisserman, 2014), and the image is from ILSVRC2012 (Krizhevsky et al., 2012).
  • Figure 3: An example of the relationship between out-of-bound sampling behavior and extra noise. Adding a bias to the input image (shown as Bias in the figure) progressively increases the out-of-bound sampling of SmoothGrad; the proportion of out-of-bound pixels and the out-of-bound value are labeled at the bottom of each saliency map.
  • Figure 4: The visual saliency map $G_{ag}$ of AdaptGrad with different extra noise levels $c$. Other settings are the same as in the figure labeled fig:boat.
  • Figure 5: The visual saliency maps from VGG16 for SmoothGrad and AdaptGrad with different sampling numbers $N$.
  • ...and 12 more figures