TraNCE: Transformative Non-linear Concept Explainer for CNNs
Ugochukwu Ejike Akpudo, Yongsheng Gao, Jun Zhou, Andrew Lewis
TL;DR
TraNCE tackles CNN explainability by automatically discovering non-linear concepts from intermediate activations using a VAE-based reducer, paired with a Bessel-function visualization and a Faith score that jointly accounts for Coherence and Fidelity. On FGVC tasks it produces meaningful concept prototypes and achieves higher global faithfulness than several baselines, providing both local and global explanations. The method emphasizes human-friendly interpretation, transferability of trained explainers, and resilience to certain image transformations, while acknowledging limitations under high inter-class similarity and in computational cost. Overall, TraNCE advances quantitative, human-centric XAI for CNNs and opens avenues for extensions to video data and transformer architectures.
Abstract
Convolutional neural networks (CNNs) have succeeded remarkably in various computer vision tasks, but they are not intrinsically explainable. While feature-level analysis of CNNs reveals where the models looked, concept-based explainability methods provide insights into what the models saw. However, existing concept-based methods assume that image activations can be reconstructed linearly, an assumption that fails to capture the intricate relationships within these activations. Their Fidelity-only approach to evaluating global explanations is a further concern. For the first time, we address these limitations with the novel Transformative Non-linear Concept Explainer (TraNCE) for CNNs. Unlike existing methods, which rely on linear reconstruction, TraNCE captures the non-linear relationships within the activations. This study presents three original contributions to the CNN explainability literature: (i) An automatic concept discovery mechanism based on variational autoencoders (VAEs), which enhances the identification of meaningful concepts from image activations. (ii) A visualization module that leverages the Bessel function to create a smooth transition between prototypical image pixels, revealing not only what the CNN saw but also what the CNN avoided, thereby mitigating the concept duplication documented in previous works. (iii) A new metric, the Faith score, that integrates both Coherence and Fidelity for a comprehensive evaluation of explainer faithfulness and consistency.
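To make contribution (i) more concrete, the sketch below shows the two ingredients that any VAE-based reducer relies on: the reparameterized sampling step and the closed-form KL term of the standard VAE objective. It is only an illustration of the generic VAE formulation, not the authors' architecture or loss; the function names, the squared-error reconstruction term, and the `beta` weight are assumptions introduced here, and the encoder and decoder networks are deliberately left out.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one sample per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction term (assumed here; other likelihoods are possible)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Generic VAE objective: reconstruction error plus weighted KL term.

    x       : a flattened activation vector (hypothetical input)
    x_hat   : its reconstruction from a decoder (not implemented here)
    mu, log_var : encoder outputs describing the latent Gaussian
    beta    : hypothetical weight on the KL term
    """
    return squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.5, -0.3], [0.0, -1.0]
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("loss:", vae_loss([1.0, 2.0], [0.9, 2.1], mu, log_var))
```

In a concept-discovery setting, `x` would stand for an intermediate activation vector and the latent dimensions would play the role of candidate concepts; because the encoder and decoder are non-linear networks, the reconstruction is not restricted to a linear combination of concept vectors.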
