On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process
Gereziher Adhane, Mohammad Mahdi Dehshibi, Dennis Vetter, David Masip, Gemma Roig
TL;DR
This paper addresses the opacity of knowledge transfer in knowledge distillation (KD) by separating teacher-derived knowledge from student-learned representations. It introduces UniCAM, a gradient-based visual explanation method that isolates distilled and residual features, and two quantitative metrics, the Feature Similarity Score (FSS) and the Relevance Score (RS), which use distance-based correlations to assess feature alignment and task relevance. Across the CIFAR-10, ASIRRA, and Plant Disease datasets, UniCAM reveals that KD guides the Student toward more task-relevant features while discarding irrelevant ones, with a higher RS indicating more meaningful transfer. The capacity-gap analyses and the use of a Teacher assistant offer practical guidance for selecting KD pairs and improving transfer efficiency, highlighting the method's potential to inform KD design and deployment.
Abstract
Knowledge distillation (KD) remains challenging because the knowledge transfer process from a Teacher to a Student is opaque, which makes it difficult to address certain issues related to KD. To address this, we propose UniCAM, a novel gradient-based visual explanation method that effectively interprets the knowledge learned during KD. Our experimental results demonstrate that, with the guidance of the Teacher's knowledge, the Student model becomes more efficient, learning more relevant features while discarding those that are not relevant. We refer to the features learned with the Teacher's guidance as distilled features and to the features that are irrelevant to the task and ignored by the Student as residual features. Distilled features focus on key aspects of the input, such as textures and parts of objects. In contrast, residual features show more diffuse attention, often targeting irrelevant areas, including the backgrounds of the target objects. In addition, we propose two novel metrics, the feature similarity score (FSS) and the relevance score (RS), which quantify the relevance of the distilled knowledge. Experiments on the CIFAR-10, ASIRRA, and Plant Disease datasets demonstrate that UniCAM and the two metrics offer valuable insights to explain the KD process.
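The summary above notes that FSS and RS rely on distance-based correlations, but this section does not give their exact definitions. As background only, the following is a minimal sketch of the standard sample distance correlation between two sets of feature vectors. It is not the paper's FSS or RS; the function names and the toy feature lists are assumptions made for illustration.

```python
import math


def pairwise_distances(rows):
    """Euclidean distance matrix between all pairs of feature vectors."""
    return [[math.dist(a, b) for b in rows] for a in rows]


def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_means = [sum(r) / n for r in d]
    col_means = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [
        [d[i][j] - row_means[i] - col_means[j] + grand for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation between paired observations x and y.

    x and y are lists of equal length; each element is a feature vector.
    The two sides may have different dimensionality.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2_xy = mean_product(a, b)  # squared distance covariance
    dvar2_x = mean_product(a, a)   # squared distance variance of x
    dvar2_y = mean_product(b, b)   # squared distance variance of y
    denom = math.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0  # one side is constant, so no dependence is measurable
    return math.sqrt(max(dcov2_xy, 0.0) / denom)


# Hypothetical toy features for four inputs (illustrative values only).
teacher_features = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0], [4.0, 0.5]]
student_features = [[0.1, 0.9], [1.2, 2.7], [1.8, 2.2], [3.9, 0.4]]
score = distance_correlation(teacher_features, student_features)
```

The statistic lies in [0, 1] and, unlike Pearson correlation, can compare feature vectors of different dimensionality, which is one plausible reason to use it when comparing models of different capacity.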
