Enhancing Explainable AI: A Hybrid Approach Combining GradCAM and LRP for CNN Interpretability

Vaibhav Dhore, Achintya Bhat, Viraj Nerlekar, Kashyap Chavhan, Aniket Umare

TL;DR

The paper addresses the challenge of CNN interpretability by combining GradCAM and LRP to produce clearer visual explanations. It introduces a pipeline that denoises the GradCAM output, multiplies it with channel-averaged LRP output, and applies Gaussian smoothing, using GradCAM++ and LRP configurations via Captum in PyTorch. The resulting method performs best on Complexity while remaining competitive on Faithfulness, Robustness, Localisation, and Randomisation, as shown through qualitative visuals and quantitative metrics across multiple benchmarks. This fusion improves the reliability and readability of explanations, supporting greater trust in, and wider applicability of, CNNs in high-stakes domains.

Abstract

We present a new technique that explains the output of a CNN-based model using a combination of the GradCAM and LRP methods. Both of these methods produce visual explanations by highlighting the input regions that are important for a prediction. In the new method, the explanation produced by GradCAM is first processed to remove noise. The processed output is then multiplied elementwise with the output of LRP. Finally, a Gaussian blur is applied to the product. We compared the proposed method with GradCAM and LRP on the metrics of Faithfulness, Robustness, Complexity, Localisation and Randomisation. We observed that this method performs better on Complexity than both GradCAM and LRP, and better than at least one of them on each of the other metrics.
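
The three steps described above (threshold the GradCAM map, multiply it with the LRP output, blur the product) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the max-normalisation of the GradCAM map, the threshold of 0.2, the value of sigma, and the assumption that the GradCAM map has already been upsampled to the input resolution are all hypothetical choices made for illustration. The channel averaging of the LRP output follows the pipeline description.

```python
import numpy as np


def gaussian_kernel1d(sigma: float, radius: int) -> np.ndarray:
    """Normalised 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(img: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Separable Gaussian blur of a 2-D array with edge-replicating padding."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns; mode="valid" trims the padding.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def fuse_gradcam_lrp(
    gradcam: np.ndarray,     # shape (H, W), non-negative, at input resolution
    lrp: np.ndarray,         # shape (C, H, W), one relevance map per colour channel
    threshold: float = 0.2,  # hypothetical cut-off, not taken from the paper
    sigma: float = 2.0,      # hypothetical blur width, not taken from the paper
) -> np.ndarray:
    # Assumption: scale the GradCAM map to [0, 1] so the threshold is relative.
    cam = gradcam / (gradcam.max() + 1e-12)
    # Denoise: values below the threshold become zero.
    cam = np.where(cam >= threshold, cam, 0.0)
    # Average the LRP relevance across the colour channels.
    lrp_mean = lrp.mean(axis=0)
    # Elementwise product, then Gaussian smoothing.
    return gaussian_blur(cam * lrp_mean, sigma)
```

In this sketch the thresholded GradCAM map acts as a coarse spatial mask, so fine-grained LRP relevance survives only inside regions that GradCAM also marks as important. In practice the two input maps would come from an attribution library such as Captum, which the TL;DR names, rather than being constructed by hand.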

Paper Structure

This paper contains 22 sections, 7 figures, and 1 table.

Figures (7)

  • Figure 1: An illustration of the proposed method. First, the GradCAM and LRP outputs are obtained for the given input image and class. The GradCAM output is processed to remove noise, so that values below a certain threshold become zero. The LRP output is averaged across the three color channels. These processed outputs are then multiplied elementwise. Finally, the product is smoothed using a Gaussian filter.
  • Figure 2: An example of the working of the proposed method.
  • Figure 3: The results of the combination of GradCAM with LRP and other visualization methods. Among these, only the combination of GradCAM with LRP produced comprehensible results, while the other combinations yielded unclear outcomes.
  • Figure 4: Examples of explanations generated for inputs of various classes on a VGG-16 model trained on the ImageNet dataset.
  • Figure 5: Further examples of explanations generated for inputs of various classes on a VGG-16 model trained on the ImageNet dataset.
  • ...and 2 more figures