Dual-Model Weight Selection and Self-Knowledge Distillation for Medical Image Classification
Ayaka Tsutsumi, Guang Li, Ren Togo, Takahiro Ogawa, Satoshi Kondo, Miki Haseyama
TL;DR
This paper tackles the challenge of deploying accurate medical image classifiers under tight computational constraints. It introduces a dual-model weight selection strategy that initializes two lightweight models from a large pretrained teacher, combined with self-knowledge distillation (SKD) that uses an auxiliary teacher based on an exponential moving average (EMA) to refine learning without excessive additional cost. Across chest X-ray, lung CT, and brain MRI datasets, the approach yields consistent accuracy gains, particularly in data-scarce scenarios, while maintaining efficiency. The work offers a practical pathway to robust, resource-efficient medical imaging systems suitable for real-world clinical deployment.
Abstract
We propose a novel medical image classification method that integrates dual-model weight selection with self-knowledge distillation (SKD). In real-world medical settings, limited computational resources often make it impractical to deploy large-scale models. Developing lightweight models that achieve performance comparable to large-scale models while remaining computationally efficient is therefore crucial. To address this, we employ a dual-model weight selection strategy that initializes two lightweight models with weights derived from a large pretrained model, enabling effective knowledge transfer. SKD is then applied to these selected models, which allows a broad range of initial weight configurations to be used without excessive additional computational cost, and the models are subsequently fine-tuned for the target classification tasks. By combining dual-model weight selection with SKD, our method overcomes a limitation of conventional approaches, which often fail to retain critical information in compact models. Extensive experiments on publicly available datasets (chest X-ray images, lung computed tomography scans, and brain magnetic resonance imaging scans) demonstrate the superior performance and robustness of our approach compared to existing methods.
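To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that weight selection copies a contiguous sub-block of each teacher tensor into the smaller student tensor, that the two lightweight models differ only in which sub-block they take (the `start` argument is a hypothetical placeholder for that choice), and that the auxiliary teacher is an exponential moving average of the student's parameters trained against with a temperature-softened KL term. The function names, the decay value, and the temperature are all illustrative assumptions.

```python
import numpy as np


def select_weights(teacher_w, student_shape, start=None):
    """Copy a contiguous sub-block of a teacher tensor into a student-sized tensor.

    `start` (hypothetical) is the offset of the block along each axis;
    the default takes the leading elements of every axis.
    """
    if start is None:
        start = (0,) * teacher_w.ndim
    if not (teacher_w.ndim == len(student_shape) == len(start)):
        raise ValueError("teacher, student shape, and start must have the same rank")
    for t, s, o in zip(teacher_w.shape, student_shape, start):
        if o < 0 or o + s > t:
            raise ValueError("requested block does not fit inside the teacher tensor")
    block = tuple(slice(o, o + s) for o, s in zip(start, student_shape))
    return teacher_w[block].copy()


def ema_update(ema_params, student_params, decay=0.99):
    """Update the auxiliary teacher in place: ema <- decay * ema + (1 - decay) * student."""
    for name, w in student_params.items():
        ema_params[name] = decay * ema_params[name] + (1.0 - decay) * w


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Batch-mean KL(teacher || student) on softened outputs, scaled by temperature^2."""
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = np.sum(np.exp(log_p) * (log_p - log_q), axis=-1)
    return float(kl.mean()) * temperature ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_layer = rng.normal(size=(8, 16))  # stand-in for one pretrained weight matrix

    # Two student initializations drawn from the same teacher tensor (assumed scheme).
    student_a = {"fc": select_weights(teacher_layer, (4, 8))}
    student_b = {"fc": select_weights(teacher_layer, (4, 8), start=(4, 8))}

    # The auxiliary teacher starts as a copy of the student and tracks it by EMA.
    ema_a = {name: w.copy() for name, w in student_a.items()}
    student_a["fc"] += 0.1  # stand-in for one gradient step
    ema_update(ema_a, student_a, decay=0.99)

    x = rng.normal(size=(2, 8))  # two dummy feature vectors
    loss = distillation_loss(x @ student_a["fc"].T, x @ ema_a["fc"].T)
    print(student_a["fc"].shape, student_b["fc"].shape, loss >= 0.0)
```

In an actual training loop the distillation term would be added to the supervised classification loss, and the EMA update would run once per optimizer step. Because the auxiliary teacher is only a running average of the student, it adds one extra forward pass rather than a separately trained network.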
