CF-CAM: Cluster Filter Class Activation Mapping for Reliable Gradient-Based Interpretability
Hongjie He, Xu Pan, Yudong Yao
TL;DR
The paper tackles the challenge of trustworthy interpretability in deep CNNs by addressing gradient-noise and efficiency concerns in visual explanations. It introduces CF-CAM, a two-stage framework that combines density-aware channel clustering (via DBSCAN) with cluster-conditioned gradient filtering and a hierarchical weighting scheme to produce faithful and edge-aware heatmaps. Experimental validation on the Shenzhen Hospital X-ray Set shows that CF-CAM achieves superior faithfulness and robustness compared with state-of-the-art CAM methods, while maintaining practical inference times. The work demonstrates CF-CAM’s potential for high-stakes applications such as medical diagnosis and autonomous driving, and suggests avenues for extending the approach to multi-modal models and broader architectures.
Abstract
As deep learning continues to advance, the transparency of neural network decision-making remains a critical challenge, limiting trust and applicability in high-stakes domains. Class Activation Mapping (CAM) techniques have emerged as a key approach to visualizing model decisions, yet existing methods face inherent trade-offs. Gradient-based CAM variants are sensitive to gradient noise, which leads to unstable and unreliable explanations. Conversely, gradient-free approaches avoid gradient instability but incur significant computational overhead and inference latency. To address these limitations, we propose a Cluster Filter Class Activation Mapping (CF-CAM) technique, a novel framework that reintroduces gradient-based weighting while enhancing robustness against gradient noise. CF-CAM uses a hierarchical importance weighting strategy to balance discriminative feature preservation and noise elimination. A density-aware channel clustering method based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN) groups semantically relevant feature channels and discards noise-prone activations. Additionally, cluster-conditioned gradient filtering applies Gaussian filters to refine gradient signals, preserving edge-aware localization while suppressing the impact of noise. Experimental results demonstrate that CF-CAM achieves superior interpretability performance while improving computational efficiency, outperforming state-of-the-art CAM methods in faithfulness and robustness. By effectively mitigating gradient instability without excessive computational cost, CF-CAM provides a competitive solution for enhancing the interpretability of deep neural networks in critical applications such as autonomous driving and medical diagnosis.
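The pipeline described in the abstract (cluster channels with DBSCAN, drop noise channels, smooth gradients per cluster, then combine with two levels of weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cf_cam_sketch`, the use of L2-normalized flattened activations as clustering features, the Grad-CAM-style spatial mean as the channel weight, the cluster-size share as the cluster weight, and the parameter defaults are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import DBSCAN


def cf_cam_sketch(activations, gradients, eps=0.5, min_samples=2, sigma=1.0):
    """Illustrative cluster-then-filter CAM (hypothetical, not the paper's code).

    activations, gradients: float arrays of shape (C, H, W) taken from one
    convolutional layer for a single image and target class.
    Returns (heatmap of shape (H, W) scaled to [0, 1], DBSCAN labels of shape (C,)).
    """
    C, H, W = activations.shape

    # Assumed channel descriptor: the flattened, L2-normalized activation map.
    feats = activations.reshape(C, -1)
    feats = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8)

    # DBSCAN assigns label -1 to channels it treats as noise.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)

    cam = np.zeros((H, W))
    for k in np.unique(labels):
        if k == -1:
            continue  # discard noise-prone channels
        idx = np.where(labels == k)[0]

        # Smooth each channel's gradient map with a Gaussian filter.
        smoothed = np.stack(
            [gaussian_filter(gradients[c], sigma=sigma) for c in idx]
        )

        # Channel-level weight (assumed): spatial mean of the smoothed gradient.
        channel_w = smoothed.mean(axis=(1, 2))
        cluster_map = np.tensordot(channel_w, activations[idx], axes=1)

        # Cluster-level weight (assumed): share of channels in this cluster.
        cam += (len(idx) / C) * cluster_map

    cam = np.maximum(cam, 0.0)  # keep positive evidence only, as in Grad-CAM
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam, labels
```

In this sketch the two weighting levels are kept separate so that either can be swapped for the scheme the paper actually uses; `eps` and `min_samples` would need tuning to the scale of the chosen channel descriptor.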
