Quantum Gradient Class Activation Map for Model Interpretability

Hsin-Yi Lin; Huan-Hsin Tseng; Samuel Yen-Chi Chen; Shinjae Yoo

TL;DR

This work addresses interpretability in quantum machine learning by introducing QGrad-CAM, a framework that uses a Variational Quantum Circuit (VQC) to assign importance to CNN activation maps through gradient-based localization. It gives an explicit activation-map weighting formula $w_k^{\ell} = \frac{1}{WH}\sum_{i,j} \frac{\partial f^{\ell}(A_{ij}^k)}{\partial A_{ij}^k}$, obtained via density-matrix expansion and Lie brackets, which enables Grad-CAM-style explanations for quantum components. The method is validated on two image datasets (MNIST, Dogs vs. Cats) and a speech corpus (TIMIT). It produces class-specific localization maps and, for speech, reveals when the model attends to background regions to detect noise. The results suggest that quantum-classical hybrids can offer transparent, computable explanations and motivate future work on leveraging quantum techniques for interpretability.
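To make the weighting formula concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the gradients $\partial f^{\ell} / \partial A_{ij}^k$ have already been computed by some other means (in the paper they come from the VQC output; here they are hand-written toy values). The final ReLU-weighted combination of maps is the standard Grad-CAM step and is an assumption here, since this summary does not spell it out. The names `channel_weights` and `cam_heatmap` are hypothetical.

```python
from typing import List

Map = List[List[float]]   # one H x W activation map A^k
Maps = List[Map]          # K activation maps


def channel_weights(grads: Maps) -> List[float]:
    """w_k = (1 / (W*H)) * sum_{i,j} d f / d A^k_ij, one weight per map k."""
    weights = []
    for g in grads:
        h, w = len(g), len(g[0])
        weights.append(sum(sum(row) for row in g) / (h * w))
    return weights


def cam_heatmap(acts: Maps, weights: List[float]) -> Map:
    """ReLU of the weighted sum of maps (standard Grad-CAM step, assumed)."""
    h, w = len(acts[0]), len(acts[0][0])
    return [
        [max(0.0, sum(wk * a[i][j] for wk, a in zip(weights, acts)))
         for j in range(w)]
        for i in range(h)
    ]


# Toy data (hypothetical): two 2x2 activation maps and their gradients.
acts = [
    [[1.0, 0.0], [0.0, 1.0]],   # A^0
    [[0.0, 3.0], [1.0, 0.0]],   # A^1
]
grads = [
    [[2.0, 2.0], [2.0, 2.0]],       # d f / d A^0: score rises with A^0
    [[-1.0, -1.0], [-1.0, -1.0]],   # d f / d A^1: score falls with A^1
]

weights = channel_weights(grads)      # [2.0, -1.0]
heatmap = cam_heatmap(acts, weights)  # [[2.0, 0.0], [0.0, 2.0]]
```

In this toy case the first map receives a positive weight and the second a negative one, so only the pixels where the first map is active survive the ReLU.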

Abstract

Quantum machine learning (QML) has recently advanced significantly across a range of topics. Despite these successes, the safety and interpretability of QML applications have not been thoroughly investigated. This work proposes using Variational Quantum Circuits (VQCs) for activation mapping to enhance model transparency, introducing the Quantum Gradient Class Activation Map (QGrad-CAM). This hybrid quantum-classical framework leverages both quantum and classical strengths and admits the derivation of an explicit formula for feature-map importance. Experimental results demonstrate fine-grained, class-discriminative visual explanations on both image and speech datasets.

Paper Structure (9 sections, 17 equations, 7 figures)

Introduction
Related work
Quantum gradient class-activation map (QGrad-CAM)
Variational Quantum Circuits
Regularities of VQC compared to neural networks
Quantum Grad-CAM by VQC
Explainability by Quantum Grad-CAM
Experiment
Conclusion

Figures (7)

Figure 1: Diagram of a VQC receiving a classical input $x_j \in \mathbb{R}^n$.
Figure 2: A circuit representation of the VQC used in this work, as defined by the paper's variational-unitary equation. The dashed-line box is repeated $k$ times to increase the depth of the circuit.
Figure 3: The workflow of Quantum Grad-CAM.
Figure 4: MNIST. The generated CAM heatmaps highlight the regions with unique and distinguishable shapes and contours for each digit. For example, the heatmap for the digit '7' emphasizes the sharp turn, while the heatmap for '6' focuses on the closed loop.
Figure 5: Dogs vs. Cats. The higher-resolution color images allow the detection of fine-grained textures and contours, which are crucial for identifying features unique to each animal.
...and 2 more figures
