Rethinking Intermediate Layers design in Knowledge Distillation for Kidney and Liver Tumor Segmentation
Vandan Gorade, Sparsh Mittal, Debesh Jha, Ulas Bagci
TL;DR
The paper tackles the challenge of applying knowledge distillation (KD) to kidney and liver tumor segmentation by introducing Hierarchical Layer-selective Feedback Distillation (HLFD), which combines feature-level and pixel-level guidance across multiple teacher-to-student layer mappings. HLFD decomposes distillation into a feature-level component (FLFD) and a pixel-level component (PLFD), incorporating unified and individual transfers with multi-task losses, so that early student layers learn higher-quality representations and pixel-level predictions become more accurate. On the KiTS and LiTS datasets, HLFD outperforms both students trained without KD and existing KD methods, with gains in Dice score and volume accuracy; qualitative analysis shows sharper focus on tumor regions and less irrelevant information. The result is a robust, compact student model with practical potential for efficient, accurate clinical diagnostics. The code is publicly available for replication.
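For readers unfamiliar with the Dice score mentioned above, the following minimal sketch computes it for two binary masks using the standard definition 2|A∩B| / (|A| + |B|). The function name `dice_score`, the flat-list mask format, and the smoothing term `eps` are illustrative assumptions, not taken from the paper's code.

```python
def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two flat binary masks (sequences of 0/1).

    eps is an assumed smoothing term that keeps the ratio defined
    when both masks are empty.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Overlap: voxels marked 1 in both the prediction and the ground truth.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total foreground size of both masks.
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


# Toy example: one of two predicted voxels overlaps the single true voxel.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A score of 1 means the predicted and reference masks coincide, and a score near 0 means they barely overlap.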
Abstract
Knowledge distillation (KD) has demonstrated remarkable success across various domains, but its application to medical imaging tasks, such as kidney and liver tumor segmentation, has encountered challenges. Many existing KD methods are not specifically tailored for these tasks. Moreover, prevalent KD methods often lack a careful consideration of 'what' and 'from where' to distill knowledge from the teacher to the student. This oversight may lead to issues like the accumulation of training bias within shallower student layers, potentially compromising the effectiveness of KD. To address these challenges, we propose Hierarchical Layer-selective Feedback Distillation (HLFD). HLFD strategically distills knowledge from a combination of middle layers to earlier layers and transfers final layer knowledge to intermediate layers at both the feature and pixel levels. This design allows the model to learn higher-quality representations from earlier layers, resulting in a robust and compact student model. Extensive quantitative evaluations reveal that HLFD outperforms existing methods by a significant margin. For example, in the kidney segmentation task, HLFD surpasses the student model (without KD) by over 10%, significantly improving its focus on tumor-specific features. From a qualitative standpoint, the student model trained using HLFD excels at suppressing irrelevant information and can focus sharply on tumor-specific details, which opens a new pathway for more efficient and accurate diagnostic tools. Code is available at https://github.com/vangorade/RethinkingKD_ISBI24.
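The layer-selective idea described above (middle layers guiding earlier layers, the final layer guiding intermediate layers) can be illustrated with a small feature-level sketch. This is not the authors' implementation: the averaging used as the "combination", the MSE distance, the `layer_map` format, and all function names are assumptions made for illustration, and features are assumed to be already flattened and aligned in size.

```python
def mse(a, b):
    """Mean squared error between two equal-length flat feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def average(vectors):
    """Element-wise mean of several equal-length vectors.

    Averaging is an assumed stand-in for the paper's 'combination' of layers.
    """
    return [sum(vals) / len(vectors) for vals in zip(*vectors)]


def feedback_loss(teacher_feats, student_feats, layer_map):
    """Sum of MSE terms, one per student layer listed in layer_map.

    layer_map: {student_layer: [teacher_layer, ...]} (hypothetical format).
    """
    loss = 0.0
    for s_layer, t_layers in layer_map.items():
        target = average([teacher_feats[t] for t in t_layers])
        loss += mse(student_feats[s_layer], target)
    return loss


# Hypothetical 4-layer networks with already-aligned, flattened features.
teacher = {1: [0.0, 0.0], 2: [1.0, 3.0], 3: [3.0, 5.0], 4: [4.0, 4.0]}
student = {1: [2.0, 4.0], 2: [4.0, 2.0], 3: [3.0, 5.0], 4: [4.0, 4.0]}

# Middle teacher layers (2, 3) guide the earliest student layer;
# the final teacher layer (4) guides an intermediate student layer.
layer_map = {1: [2, 3], 2: [4]}
loss = feedback_loss(teacher, student, layer_map)
```

A pixel-level counterpart could plausibly follow the same pattern, comparing per-pixel predictions instead of features; the exact losses and layer choices are given in the paper and its repository.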
