Robustness of Visual Explanations to Common Data Augmentation

Lenka Tětková, Lars Kai Hansen

TL;DR

This paper addresses the reliability of post-hoc visual explanations under naturally occurring data augmentations. By partitioning augmentations into invariant and equivariant groups and evaluating multiple explanation methods across CNN architectures on ImageNet, it introduces a stability metric $S(\text{correlation}, \text{probability})$ and a pixel-flipping faithfulness score to quantify robustness and fidelity. The findings show that explanations are generally less robust than model predictions, with invariant transformations yielding more stable attributions than equivariant ones; among methods, LRP composites and Guided Backpropagation provide the best stability, while gradient-based methods are the least robust. Training with augmented data does not fully resolve the instability, underscoring the need for more robust explanation techniques before deploying them in real-world vision tasks.

Abstract

As the use of deep neural networks continues to grow, understanding their behaviour has become more crucial than ever. Post-hoc explainability methods are a potential solution, but their reliability is being called into question. Our research investigates the response of post-hoc visual explanations to naturally occurring transformations, often referred to as augmentations. We anticipate explanations to be invariant under certain transformations, such as changes to the colour map, while responding in an equivariant manner to transformations like translation, object scaling, and rotation. We have found remarkable differences in robustness depending on the type of transformation, with some explainability methods (such as LRP composites and Guided Backprop) being more stable than others. We also explore the role of training with data augmentation. We provide evidence that explanations are typically less robust to augmentation than classification performance, regardless of whether data augmentation is used in training or not.
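The distinction between invariant and equivariant responses can be made concrete with a small sketch. This is not the authors' implementation: `toy_explain`, `position_biased_explain`, and the two scoring helpers are hypothetical stand-ins, and Pearson correlation is assumed as the similarity measure because the figure captions refer to correlations between original and augmented images. For a transformation $T$ and an explanation $e$, the invariant case compares $e(T(x))$ with $e(x)$, while the equivariant case compares $e(T(x))$ with $T(e(x))$.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation between two attribution maps, flattened."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def invariance_score(explain, image, transform):
    """Photometric case (e.g. brightness): compare e(T(x)) with e(x)."""
    return pearson(explain(transform(image)), explain(image))


def equivariance_score(explain, image, transform):
    """Geometric case (e.g. rotation): compare e(T(x)) with T(e(x))."""
    return pearson(explain(transform(image)), transform(explain(image)))


# Hypothetical stand-ins for real explanation methods.
def toy_explain(image):
    # Depends only on each pixel's deviation from the image mean.
    return np.abs(image - image.mean())


def position_biased_explain(image):
    # Weights pixels by column index, so it cannot follow a rotation.
    return image * np.arange(image.shape[1])


def brighten(x):
    # Additive brightness shift; clipping is omitted in this toy setting.
    return x + 0.2


rng = np.random.default_rng(0)
img = rng.random((8, 8))     # toy single-channel "image"
rotate = np.rot90            # 90-degree rotation

inv = invariance_score(toy_explain, img, brighten)
eqv = equivariance_score(toy_explain, img, rotate)
eqv_biased = equivariance_score(position_biased_explain, img, rotate)
```

In this toy setting `inv` and `eqv` are numerically 1 because `toy_explain` ignores both the brightness offset and the pixel positions, whereas `eqv_biased` is well below 1. Real attribution methods would sit somewhere between these extremes, and a real pipeline would also need to handle interpolation and border effects for non-right-angle rotations.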

Paper Structure (13 sections, 18 figures, 3 tables)

This paper contains 13 sections, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Visualization of the metrics defined in the methods section on metrics. In both cases, we compute the portion of the yellow part in the green rectangle.
  • Figure 2: Example of curves showing the probabilities and correlations between the original and the rotated images.
  • Figure 3: Examples of the augmented images and their explanations.
  • Figure 4: Comparison of S(correlation, probability) and pixel-flipping score for "ResNet50 full aug". The scores are defined in the methods section on metrics. The x-axis shows the average of S(correlation, probability) over all six augmentation methods used in this paper. The dashed line corresponds to a baseline pixel-flipping score computed with random sorting of the pixels. The best methods are in the top right corner.
  • Figure 5: Comparison of "ResNet50 full aug" (391 images) and "ResNet50 lim aug" (385 images) for each explainability method. We plot S(correlation, probability) for changes in brightness (AddToBrightness from -95 to 95). Boxes show the quartiles and medians, and whiskers extend to the most extreme non-outlier data points.
  • ...and 13 more figures