A Super-pixel-based Approach to the Stable Interpretation of Neural Networks

Shizhan Gong, Jingwei Zhang, Qi Dou, Farzan Farnia

TL;DR

This work tackles the instability of gradient-based saliency maps caused by stochastic training. It introduces a semantically informed pixel grouping via super-pixels, enabling a grouped gradient approach that reduces variance and improves generalization of explanations. The authors provide theoretical stability guarantees and demonstrate, on CIFAR-10 and ImageNet, that super-pixel saliency maps offer higher stability, better generalization (MeGe), and enhanced interpretability (ROAR/ROAD) with only modest fidelity trade-offs. The method remains computationally efficient and complements existing gradient-based approaches like SmoothGrad and Grad-CAM, with potential applicability beyond image data.

Abstract

Saliency maps are widely used in the computer vision community to interpret neural network classifiers. However, because both the training samples and the optimization algorithm are random, the resulting saliency maps are highly stochastic, which makes it difficult for domain experts to identify the intrinsic factors that influence the neural network's decision. In this work, we propose a novel pixel partitioning strategy to boost the stability and generalizability of gradient-based saliency maps. Through both theoretical analysis and numerical experiments, we demonstrate that grouping pixels reduces the variance of the saliency map and improves the generalization behavior of the interpretation method. Furthermore, we propose a grouping strategy based on super-pixels, which clusters pixels into groups that align well with the semantic content of the images. We perform several numerical experiments on CIFAR-10 and ImageNet. Our empirical results suggest that super-pixel-based interpretation maps are consistently more stable and of higher quality than pixel-based saliency maps.
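The grouping idea can be illustrated with a minimal sketch. It is not the authors' implementation: the aggregation rule (mean over each group), the helper names `grid_superpixels` and `group_saliency`, and the fixed square grid used in place of a real super-pixel algorithm such as Quickshift are all assumptions made for illustration. The synthetic "saliency map" is a clean signal plus Gaussian noise, standing in for the randomness that training introduces.

```python
import numpy as np

def grid_superpixels(height, width, cell):
    """Hypothetical stand-in for a super-pixel algorithm:
    assign one integer label to each cell x cell block."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = -(-width // cell)  # ceiling division
    return rows[:, None] * n_cols + cols[None, :]

def group_saliency(saliency, labels):
    """Replace each pixel's score by the mean score of its group
    (mean aggregation is an assumption of this sketch)."""
    flat_labels = labels.ravel()
    sums = np.bincount(flat_labels, weights=saliency.ravel())
    counts = np.bincount(flat_labels)
    means = sums / np.maximum(counts, 1)
    return means[labels]

# Synthetic example: a square "object" plus per-pixel noise.
rng = np.random.default_rng(0)
signal = np.zeros((32, 32))
signal[8:24, 8:24] = 1.0
noisy = signal + rng.normal(scale=1.0, size=signal.shape)

labels = grid_superpixels(32, 32, cell=4)
grouped = group_saliency(noisy, labels)

# Mean squared deviation from the noise-free signal, before and after grouping.
pixel_error = np.mean((noisy - signal) ** 2)
grouped_error = np.mean((grouped - signal) ** 2)
print(pixel_error, grouped_error)
```

In this toy setting the groups align with the object boundary, so averaging removes noise without blurring the signal. With real images that alignment is what a semantically meaningful super-pixel segmentation is meant to provide.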

Paper Structure

This paper contains 12 sections, 5 theorems, 5 equations, 5 figures, 1 table.

Key Result

Theorem 1

Suppose an interpretation scheme $I$ is $\epsilon$-uniformly stable. Then, the following generalization bound holds: $\epsilon_{\text{gen}} (I) \, \leq\, \epsilon.$

Figures (5)

  • Figure 1: Similarity between saliency maps from two separately trained neural nets: the SSIM between the pixel-based maps is significantly lower than that between the super-pixel-based maps.
  • Figure 2: Left: The SSIM of saliency maps between two models trained with disjoint training datasets or different initializations on CIFAR-10 (top) and ImageNet (bottom). Middle: Comparison of SSIM with region-based interpretation. Right: Comparison of MeGe between pixel-based and super-pixel-based methods. $L_2$-norm is used as the distance measure.
  • Figure 3: Qualitative comparison on ImageNet dataset between pixel-based and super-pixel-based interpretation maps for different gradient-based methods.
  • Figure 4: Visualization of super-pixel-based simple gradient maps with different numbers of groups in Quickshift (top) and different super-pixel algorithms (bottom).
  • Figure 5: Left: Tradeoff curve between SSIM and $\mu$Fidelity. The size of the dots reflects the size of super-pixels. Middle: Comparison of ROAD (top) and ROAR (bottom) between pixel-based and super-pixel-based methods. Right: $L_2$-norm difference between super-pixel-based SimpleGrad / SmoothGrad and standard SimpleGrad.
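
The stability comparison reported in the figures uses SSIM between saliency maps from separately trained models. The sketch below shows the shape of such a comparison on synthetic data; it is an illustration under stated assumptions, not the paper's evaluation code. `global_ssim` is a simplified single-window SSIM (the standard metric averages over local windows), `block_mean` is a hypothetical grid-based grouping used in place of real super-pixels, and the two "models" are simulated by adding independent noise to the same underlying signal.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """Simplified SSIM computed over the whole map as one window.
    The constants follow the usual 0.01 / 0.03 convention."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a**2 + mu_b**2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

def block_mean(x, cell):
    """Hypothetical grouping: average over cell x cell blocks,
    then expand back to the original resolution."""
    h, w = x.shape
    coarse = x.reshape(h // cell, cell, w // cell, cell).mean(axis=(1, 3))
    return coarse.repeat(cell, axis=0).repeat(cell, axis=1)

# Two simulated saliency maps: same signal, independent noise.
rng = np.random.default_rng(0)
signal = np.zeros((32, 32))
signal[8:24, 8:24] = 1.0
map_a = signal + rng.normal(scale=0.5, size=signal.shape)
map_b = signal + rng.normal(scale=0.5, size=signal.shape)

ssim_pixel = global_ssim(map_a, map_b)
ssim_grouped = global_ssim(block_mean(map_a, 4), block_mean(map_b, 4))
print(ssim_pixel, ssim_grouped)
```

For a windowed SSIM on real saliency maps, a library implementation would be the more appropriate choice; the single-window version is used here only to keep the example dependency-free.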

Theorems & Definitions (8)

  • Definition 1
  • Definition 2: Interpretation Generalization Error
  • Definition 3: $\epsilon$-uniformly stable
  • Theorem 1
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Corollary 1