A Knowledge Distillation-Based Approach to Enhance Transparency of Classifier Models
Yuchen Jiang, Xinyuan Zhao, Yihang Wu, Ahmad Chaddad
TL;DR
This work tackles the need for explainable AI in medical image analysis by introducing a KD-based framework that distills a DenseNet121 teacher into a shallow five-layer student. It uses KD-FMV, combining a hard loss $\ell_{HL}$ and a soft loss $\ell_{SL}$ with temperature $T$ to transfer feature representations via $\mathcal{L}_{distill}=\alpha \cdot \ell_{HL} + (1-\alpha) \ell_{SL}$, while leveraging average feature maps to visualize per-layer decision processes. Grad-CAM and SHAP are employed to validate interpretability, and the approach is evaluated on brain tumor, eye disease, and Alzheimer's datasets, with the student achieving near-teacher accuracy (and sometimes surpassing it) while reducing model depth. Additionally, the method reduces FLOPs and mean execution time, enabling faster and more efficient interpretability suitable for resource-limited clinical settings.
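The combined objective $\mathcal{L}_{distill}=\alpha \cdot \ell_{HL} + (1-\alpha) \ell_{SL}$ can be illustrated with a minimal, standard-library sketch. The summary above does not give the exact form of either term, so the sketch assumes a common Hinton-style formulation: $\ell_{HL}$ is the cross-entropy between the student prediction and the true label, and $\ell_{SL}$ is the KL divergence between temperature-softened teacher and student distributions, scaled by $T^2$. The function names, the default values of `alpha` and `T`, and the example logits are all hypothetical.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax; a larger T gives a softer distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def hard_loss(student_logits, label):
    """Assumed l_HL: cross-entropy of the student against the true label."""
    return -math.log(softmax(student_logits)[label])


def soft_loss(student_logits, teacher_logits, T):
    """Assumed l_SL: T^2 * KL(teacher || student) on softened outputs."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return T * T * kl


def distill_loss(student_logits, teacher_logits, label, alpha=0.5, T=4.0):
    """L_distill = alpha * l_HL + (1 - alpha) * l_SL."""
    return (alpha * hard_loss(student_logits, label)
            + (1 - alpha) * soft_loss(student_logits, teacher_logits, T))


# Hypothetical logits for a three-class problem; the true class is index 2.
loss = distill_loss([1.0, 2.0, 3.0], [0.5, 1.0, 4.0], label=2)
print(f"distillation loss: {loss:.4f}")
```

Setting `alpha=1.0` reduces the objective to ordinary supervised training of the student, while smaller values give more weight to the teacher's softened outputs.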
Abstract
With the rapid development of artificial intelligence (AI), especially in the medical field, the need for explainability has grown. In medical image analysis, transparent and interpretable models help clinicians better understand and trust the decision-making process of AI models. In this study, we propose a Knowledge Distillation (KD)-based approach that aims to enhance the transparency of AI models in medical image analysis. We first train a conventional CNN as the teacher model and then use KD to distill it into a simpler CNN architecture that has fewer layers yet retains most of the features learned from the dataset. We then analyze the feature maps of the student model layer by layer to identify the key features and trace the decision-making process, which yields intuitive visual explanations. We test our method on three public medical datasets (brain tumor, eye disease, and Alzheimer's disease). The results show that, even with fewer layers, our model maintains strong performance on the test set while reducing the time required for interpretability analysis.
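The layer-wise feature-map analysis can be sketched as follows. The abstract does not specify how the feature maps are aggregated, so this minimal example assumes that a layer's output of shape (channels, height, width) is averaged over the channel axis and then min-max normalized for display. The function names and the toy activations are hypothetical, and nested lists stand in for the tensors a deep learning framework would provide.

```python
def average_feature_map(feature_maps):
    """Average C channel maps (each H x W) into a single H x W map.

    Assumption: "average feature map" means the channel-wise mean.
    """
    c = len(feature_maps)
    h = len(feature_maps[0])
    w = len(feature_maps[0][0])
    return [[sum(feature_maps[k][i][j] for k in range(c)) / c
             for j in range(w)]
            for i in range(h)]


def normalize(fmap):
    """Min-max scale a 2D map to [0, 1] so it can be shown as a heatmap."""
    flat = [v for row in fmap for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant map: avoid division by zero
        return [[0.0 for _ in row] for row in fmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in fmap]


# Toy activations for one layer: 2 channels, each 2 x 2.
layer_output = [
    [[0.0, 2.0], [4.0, 6.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
heatmap = normalize(average_feature_map(layer_output))
for row in heatmap:
    print(row)
```

Applying the same two steps to the output of each student layer gives one heatmap per layer, which can be compared side by side to follow how the highlighted regions change with depth.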
