Segmentation and Smoothing Affect Explanation Quality More Than the Choice of Perturbation-based XAI Method for Image Explanations

Gustav Grund Pihlgren, Kary Främling

TL;DR

This paper analyzes perturbation-based image explanations to determine which pipeline parameters most influence explanation quality. By exhaustively combining segmentation, sampling, perturbation, and attribution choices and evaluating against the SRG proxy on ImageNet with three CNNs, it finds that segmentation and per-pixel attribution dominate performance while the specific attribution calculation method has a smaller effect. The study introduces Gaussian smoothing for the perturbation mask to enable alternative segmentations (notably SLIC), adapts PDA to multi-segment perturbations, and demonstrates that per-pixel attribution with improved segmentation yields the best SRG gains. These insights challenge common assumptions about attribution methods and highlight the importance of the entire pipeline design for reliable explanations, with implications for efficiency and human interpretability in XAI systems.

Abstract

Perturbation-based post-hoc image explanation methods are commonly used to explain image prediction models. These methods perturb parts of the input to measure how those parts affect the output. Since the methods only require the input and output, they can be applied to any model, making them a popular choice for explaining black-box models. While many different methods exist and have been compared with one another, it remains poorly understood which parameters of the different methods are responsible for their varying performance. This work uses the Randomized Input Sampling for Explanations (RISE) method as a baseline to evaluate many combinations of mask sampling, segmentation techniques, smoothing, attribution calculation, and per-segment or per-pixel attribution, using a proxy metric. The results show that attribution calculation, which is frequently the focus of other works, has little impact on the results. Conversely, segmentation and per-pixel attribution, which are rarely examined, have a significant impact. The implementation and the data gathered in this work are available online: https://github.com/guspih/post-hoc-image-perturbation and https://bit.ly/smooth-mask-perturbation.
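
The perturb-and-measure idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes a grayscale image, a plain grid segmentation without smoothing, independent random sampling of segments, and a simplified RISE-like attribution that averages the model score over the samples in which a segment was kept. The names `rise_style_attribution`, `predict`, `grid`, and `keep_prob` are hypothetical.

```python
import numpy as np

def rise_style_attribution(image, predict, grid=4, n_samples=200,
                           keep_prob=0.5, seed=0):
    """Simplified RISE-like attribution on a grid (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    h, w = image.shape

    # Segmentation: assign every pixel the index of its grid cell.
    rows = np.arange(h) * grid // h
    cols = np.arange(w) * grid // w
    segments = rows[:, None] * grid + cols[None, :]      # shape (h, w)

    # Sampling: 1 keeps a segment, 0 perturbs it (here: sets it to zero).
    samples = (rng.random((n_samples, grid * grid)) < keep_prob).astype(float)

    # Perturbation and prediction: query the black-box model per sample.
    scores = np.empty(n_samples)
    for i, sample in enumerate(samples):
        pixel_mask = sample[segments]                    # shape (h, w)
        scores[i] = predict(image * pixel_mask)

    # Attribution: mean score over the samples in which a segment was kept.
    kept_counts = np.maximum(samples.sum(axis=0), 1.0)
    per_segment = (scores @ samples) / kept_counts

    # Without smoothing, per-pixel attribution just copies the segment value.
    per_pixel = per_segment[segments]
    return per_pixel, per_segment

# Toy black-box model that only looks at the top-left quadrant.
toy_image = np.ones((8, 8))
toy_predict = lambda x: float(x[:4, :4].mean())
pixel_map, segment_scores = rise_style_attribution(toy_image, toy_predict, grid=2)
```

In this toy setup the top-left segment receives the highest attribution, because the toy model ignores everything else. Swapping the grid for another segmentation, or smoothing `pixel_mask` before applying it, changes only the segmentation and perturbation steps, which are the parameters the abstract identifies as most influential.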

Paper Structure (11 sections, 4 figures, 4 tables)

This paper contains 11 sections, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The pipeline for perturbation-based image attribution used in this work. The image is segmented, samples indicating what segments to perturb are drawn, the sampled segments are perturbed, the model to explain makes predictions for the perturbed samples, and the input-output pairs are used to calculate per-segment and per-pixel attribution.
  • Figure 2: Showcase of how LIF, MIF, and SRG metrics are calculated by steadily occluding the least or most influential pixels of an image and calculating the value of the top class predicted for the original image.
  • Figure 3: Attribution maps per attribution method overlaid on an image from the ostrich class, explaining the correct prediction by AlexNet. The maps are attributed per-pixel and use grid+bilinear segmentation and entropic sampling with a sample size of 400. The pixels are colored in an even spectrum between red and blue according to their rank from highest attribution to lowest.
  • Figure 4: Per-segment and per-pixel attribution maps generated using both Grid+Gaussian and SLIC+Gaussian segmentation overlaid on an image of a snail, explaining the correct prediction by AlexNet. The maps were generated using "only one" sampling and are equivalent across all attribution methods. The pixels are colored in an even spectrum between red and blue according to their rank from highest attribution to lowest.
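
The occlusion procedure described in the Figure 2 caption can be sketched as follows. This page does not give the exact formulas, so the sketch makes assumptions: LIF and MIF are taken as the mean model score along the curve obtained by occluding the least or most influential pixels first, and SRG is taken as LIF minus MIF. It also assumes a grayscale image and occlusion by a constant `baseline` value. The helper names `occlusion_curve` and `srg_score` are hypothetical.

```python
import numpy as np

def occlusion_curve(image, attribution, predict, most_first,
                    steps=10, baseline=0.0):
    """Model score while pixels are occluded in attribution-rank order."""
    order = np.argsort(attribution, axis=None)   # least influential first
    if most_first:
        order = order[::-1]                      # most influential first
    occluded = image.astype(float).copy()
    flat = occluded.reshape(-1)                  # view onto `occluded`
    scores = [predict(occluded)]                 # score of the intact image
    for chunk in np.array_split(order, steps):
        flat[chunk] = baseline                   # occlude the next pixels
        scores.append(predict(occluded))
    return np.array(scores)

def srg_score(image, attribution, predict, steps=10):
    """Assumed definition: SRG = mean(LIF curve) - mean(MIF curve)."""
    lif = occlusion_curve(image, attribution, predict, False, steps).mean()
    mif = occlusion_curve(image, attribution, predict, True, steps).mean()
    return lif - mif, lif, mif

# Toy model that depends on a single pixel; a good map ranks it highest.
toy_image = np.ones((4, 4))
toy_predict = lambda x: float(x[0, 0])
good_map = np.zeros((4, 4))
good_map[0, 0] = 1.0
srg_good, lif_good, mif_good = srg_score(toy_image, good_map, toy_predict, steps=4)
srg_bad, _, _ = srg_score(toy_image, -good_map, toy_predict, steps=4)
```

Under these assumptions, an attribution map that ranks the truly influential pixel first yields a positive score, and the inverted map yields a negative one. This is what makes such a metric usable as a proxy for explanation quality.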