Empowering CAM-Based Methods with Capability to Generate Fine-Grained and High-Faithfulness Explanations

Changqing Qiu, Fusheng Jin, Yining Zhang

TL;DR

FG-CAM addresses two limitations of existing explanation methods: CAM-based methods yield only coarse-grained explanations at deep layers, and LRP yields fine-grained explanations with relatively low faithfulness. It produces fine-grained, high-faithfulness explanations without altering the characteristics of CAM-based methods. It progressively refines the explanation by exploiting the relationship between adjacent feature-map layers of different resolutions, identifying contributing pixels and filtering out non-contributing ones. A denoising variant reduces noise with negligible loss of faithfulness. Empirical results show that FG-CAM outperforms existing CAM-based methods in shallow and intermediate layers and outperforms LRP and its variants at the input layer, with performance largely unaffected by explanation resolution. The work provides publicly available code and demonstrates broad applicability across CNN architectures.

Abstract

Recently, the explanation of neural network models has garnered considerable research attention. In computer vision, CAM (Class Activation Map)-based methods and LRP (Layer-wise Relevance Propagation) are two common explanation methods. However, since most CAM-based methods can only generate global weights, they can only generate coarse-grained explanations at a deep layer. LRP and its variants, on the other hand, can generate fine-grained explanations, but the faithfulness of these explanations is too low. To address these challenges, we propose FG-CAM (Fine-Grained CAM), which extends CAM-based methods to generate fine-grained and high-faithfulness explanations. FG-CAM uses the relationship between two adjacent layers of feature maps with different resolutions to gradually increase the explanation resolution, while finding the contributing pixels and filtering out the pixels that do not contribute. Our method not only overcomes this shortcoming of CAM-based methods without changing their characteristics, but also generates fine-grained explanations with higher faithfulness than LRP and its variants. We also present FG-CAM with denoising, a variant of FG-CAM that generates less noisy explanations with almost no change in explanation faithfulness. Experimental results show that the performance of FG-CAM is almost unaffected by the explanation resolution. FG-CAM significantly outperforms existing CAM-based methods in both shallow and intermediate layers, and significantly outperforms LRP and its variants in the input layer. Our code is available at https://github.com/dongmo-qcq/FG-CAM.
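
To make the "global weights" limitation concrete, the following is a minimal sketch of a Grad-CAM-style baseline, not the FG-CAM algorithm itself. It assumes one scalar weight per channel, obtained by averaging gradients over the spatial dimensions. The function name `grad_cam`, the 512-channel 14×14 feature map, and the 224×224 input size are illustrative assumptions, and random arrays stand in for real activations and gradients.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM-style map for one layer (illustrative helper, not from the paper).

    activations, gradients: arrays of shape (K, H, W) for K channels.
    Returns a non-negative map of shape (H, W).
    """
    # One global weight per channel: the spatial mean of the gradients.
    weights = gradients.mean(axis=(1, 2))                # shape (K,)
    # Weighted sum of the channels, then ReLU.
    cam = np.tensordot(weights, activations, axes=1)     # shape (H, W)
    return np.maximum(cam, 0.0)


# Toy stand-ins for a deep convolutional layer (assumed sizes).
rng = np.random.default_rng(0)
acts = rng.random((512, 14, 14))
grads = rng.standard_normal((512, 14, 14))

cam = grad_cam(acts, grads)
print(cam.shape)  # (14, 14)

# Nearest-neighbour upsampling to an assumed 224x224 input:
# every map cell becomes a constant 16x16 block, so no detail is added.
upsampled = np.kron(cam, np.ones((16, 16)))
print(upsampled.shape)  # (224, 224)
```

Because each channel receives a single weight, the map can be no finer than the feature map it is computed on; upsampling only enlarges each cell. This is the resolution gap that the abstract says FG-CAM closes by refining the explanation layer by layer.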

Paper Structure (4 sections, 1 equation, 2 figures)

Figures (2)

  • Figure 1: Visualization results of FG-CAM and other CAM-based methods on VGG-16 with batch normalization. Layer43 represents model.features[43]. Pixels with higher brightness are more important.
  • Figure 2: FG-CAM pipeline. In step 1, the explanation components are calculated using a CAM-based method in the last convolutional layer. In step 2, the resolution of the explanation components is improved. Finally, in step 3, a fine-grained explanation is generated.