Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs
Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yulan Guo, Yinghui Gao, Biao Li
TL;DR
This work addresses the lack of theoretical grounding in CAM-based CNN visualizations by introducing two axioms, Sensitivity and Conservation, and proposing XGrad-CAM, an axiom-informed visualization that generalizes to arbitrary CNNs. It derives an approximate, gradient-weighted scheme for combining feature maps and provides a Guided variant for finer detail. In experiments on VGG-16 and multiple benchmarks, XGrad-CAM satisfies the axioms more closely than Grad-CAM and achieves competitive localization and class discrimination, while being substantially more efficient than Ablation-CAM. The approach offers a principled framework for interpreting CNN decisions, with practical use in visualization and debugging.
Abstract
To better understand and use Convolutional Neural Networks (CNNs), the visualization and interpretation of CNNs have attracted increasing attention in recent years. In particular, several Class Activation Mapping (CAM) methods have been proposed to discover the connection between a CNN's decision and image regions. Although these methods produce reasonable visualizations, their main limitation is the lack of clear and sufficient theoretical support. In this paper, we introduce two axioms, Conservation and Sensitivity, to the visualization paradigm of the CAM methods. Meanwhile, a dedicated Axiom-based Grad-CAM (XGrad-CAM) is proposed to satisfy these axioms as much as possible. Experiments demonstrate that XGrad-CAM is an enhanced version of Grad-CAM in terms of conservation and sensitivity. It achieves better visualization performance than Grad-CAM, while also being class-discriminative and easy to implement compared with Grad-CAM++ and Ablation-CAM. The code is available at https://github.com/Fu0511/XGrad-CAM.
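To make the Conservation axiom concrete, the following is a minimal sketch and not the authors' released code. It assumes the channel weighting commonly given for XGrad-CAM, in which each gradient is weighted by the normalized activation at the same location, and compares it with Grad-CAM's plain gradient average. The abstract does not state this formula, so treat it as an assumption. The function names, the toy feature maps, and the linear score are all hypothetical, and the final ReLU on the map is left out so that the sums can be compared directly.

```python
# Toy comparison of Grad-CAM and (assumed) XGrad-CAM channel weights.
# Feature maps and gradients are lists of K maps, each a list of rows of floats.

def grad_cam_weights(gradients):
    """Grad-CAM: weight of channel k is the spatial mean of its gradients."""
    weights = []
    for grad in gradients:
        values = [g for row in grad for g in row]
        weights.append(sum(values) / len(values))
    return weights

def xgrad_cam_weights(feature_maps, gradients):
    """Assumed XGrad-CAM rule: gradients weighted by normalized activations."""
    weights = []
    for fmap, grad in zip(feature_maps, gradients):
        total = sum(f for row in fmap for f in row)
        weighted = sum(f * g
                       for frow, grow in zip(fmap, grad)
                       for f, g in zip(frow, grow))
        # Guard against an all-zero feature map (hypothetical convention).
        weights.append(weighted / total if total != 0 else 0.0)
    return weights

def activation_map(feature_maps, weights):
    """Weighted sum of feature maps (no ReLU, to keep the sum comparable)."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, alpha in zip(feature_maps, weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * fmap[i][j]
    return cam

def map_sum(cam):
    return sum(v for row in cam for v in row)

# Hypothetical toy layer: two 1x2 feature maps and a bias-free linear class
# score S = sum(w * F), so the gradient of S w.r.t. F is simply w.
feature_maps = [[[1.0, 3.0]], [[2.0, 2.0]]]
gradients = [[[1.0, 0.0]], [[0.5, -0.5]]]
score = sum(f * g
            for fmap, grad in zip(feature_maps, gradients)
            for frow, grow in zip(fmap, grad)
            for f, g in zip(frow, grow))

xgrad_map = activation_map(feature_maps, xgrad_cam_weights(feature_maps, gradients))
grad_map = activation_map(feature_maps, grad_cam_weights(gradients))

print(score)               # 1.0
print(map_sum(xgrad_map))  # 1.0, matches the score (conservation holds here)
print(map_sum(grad_map))   # 2.0, does not match the score
```

In this bias-free linear toy case the assumed XGrad-CAM map sums exactly to the class score, while the Grad-CAM map does not. For real CNNs the abstract only claims that the axioms are satisfied "as much as possible", so the equality should be expected to hold approximately at best.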
