Strengthening Interpretability: An Investigative Study of Integrated Gradient Methods

Shree Singhi, Anupriya Kumari

TL;DR

The paper presents a reproducibility study of Integrated Gradients (IG) methods and the Important Direction Gradient Integration (IDGI) framework, examining both the theoretical claims and the empirical performance. It provides a theoretical analysis, including a Taylor-series derivation of key quantities such as $x_{j_p}$, and demonstrates that IDGI can be more sensitive to the number of Riemann-sum steps than the underlying IG methods. Through experiments on ImageNet with multiple models and metrics (Insertion Score, SIC, AIC, and MS-SSIM variants), the authors show that IDGI improves attribution quality for several baselines but not universally: certain architectures (e.g., residual networks) exhibit anomalies, and the number of steps significantly influences performance. The study also finds that IDGI tends to improve numerical stability, and it provides a codebase to facilitate reproducibility, offering guidance for practitioners who apply attribution methods to vision models.

Abstract

We conducted a reproducibility study of Integrated Gradients (IG)-based methods and the Important Direction Gradient Integration (IDGI) framework. IDGI eliminates the explanation noise in each step of the computation of IG-based methods that use Riemann integration to compute integrated gradients. We performed a rigorous theoretical analysis of IDGI and raised a few critical questions that we later address through our study. We also experimentally verified the original authors' claims concerning the performance of IDGI over IG-based methods. Additionally, we varied the number of steps used in the Riemann approximation, an essential parameter in all IG methods, and analyzed the corresponding change in results. We also studied the numerical instability of the attribution methods to check the consistency of the saliency maps produced. Because the available code was insufficient for this study, we developed the complete code to implement IDGI over the baseline IG methods and evaluated them using three metrics.
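
To make the role of the step count concrete, the following is a minimal sketch of plain IG computed with a left Riemann sum along the straight-line path from a baseline to the input. It is not the authors' code and does not implement IDGI: the toy function `f`, its analytic gradient `grad_f`, and the helper `completeness_gap` are assumptions introduced purely for illustration. The sketch checks how far the attributions are from summing to `f(x) - f(baseline)` as the number of steps changes.

```python
import numpy as np

def f(x):
    # Hypothetical toy "model": a smooth scalar function of a 3-d input.
    return np.tanh(x[0] * x[1]) + x[2] ** 2

def grad_f(x):
    # Analytic gradient of the toy function above.
    s = 1.0 - np.tanh(x[0] * x[1]) ** 2
    return np.array([s * x[1], s * x[0], 2.0 * x[2]])

def integrated_gradients(x, baseline, steps):
    # Left Riemann sum of the gradient along the straight-line path
    # baseline + alpha * (x - baseline), with alpha in {0, 1/steps, ...}.
    total = np.zeros_like(x)
    for k in range(steps):
        alpha = k / steps
        total += grad_f(baseline + alpha * (x - baseline))
    return (x - baseline) * total / steps

def completeness_gap(x, baseline, steps):
    # |sum of attributions - (f(x) - f(baseline))|; tends to zero as the
    # Riemann approximation becomes finer.
    attr = integrated_gradients(x, baseline, steps)
    return abs(attr.sum() - (f(x) - f(baseline)))

x = np.array([1.0, 2.0, 0.5])
baseline = np.zeros(3)
for n in (8, 64, 512):
    print(n, completeness_gap(x, baseline, n))
```

For this toy function the gap shrinks as the step count grows, which is one reason the step count can be expected to matter when comparing attribution methods; how real networks behave is an empirical question of the kind the study investigates.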

Paper Structure

This paper contains 16 sections, 8 equations, 5 figures, 17 tables, 1 algorithm.

Figures (5)

  • Figure 1: Saliency maps of the existing IG-based methods and those with IDGI explaining the prediction from all models. While we cannot make any comments about IDGI's results being objectively better or worse for this instance, one can see that IDGI's saliency maps tend to be more patch-like and do not highlight edges of the input image as observed without IDGI.
  • Figure 2: The original illustration of IDGI b8 (left) and our improved illustration (right).
  • Figure 3: Saliency maps of the existing IG-based methods and those with IDGI explaining the prediction from InceptionV3. We compare two sets of saliency maps for each image by taking each of the model's top two distinct classes in turn as the predicted class. While the top-class object is always more highlighted, we observe that all IG-based methods with IDGI are slightly better at highlighting each class than methods without IDGI.
  • Figure 4: Insertion Score with probability and probability ratio, AIC and SIC using Normalized Entropy and MS-SSIM vs. number of steps, for InceptionV3.
  • Figure 5: Images observed along the path of BlurIG and IG. For BlurIG, for most values of $\alpha$, we notice minimal changes in the image with small increments. In contrast, for IG, a uniform change in the image is observed with the same increments in $\alpha$.