Dynamic Contrastive Knowledge Distillation for Efficient Image Restoration
Yunshuai Zhou, Junbo Qiao, Jincheng Liao, Wei Li, Simiao Li, Jiao Xie, Yunhang Shen, Jie Hu, Shaohui Lin
TL;DR
Dynamic Contrastive Knowledge Distillation (DCKD) addresses the fixed solution space of prior KD methods for image restoration with two components: Dynamic Contrastive Regularization (DCR) and a Distribution Mapping Module (DMM). DCR builds a dynamic lower bound that follows the student's learning state, using a degradation-based negative-sample generator and an EMA-updated history model. DMM aligns pixel-level distributions using a VQGAN-based encoder and an explicit cross-entropy constraint. The framework is structure-agnostic and can be combined with upper-bound enhancements. It outperforms state-of-the-art KD methods on image super-resolution, deblurring, and deraining across multiple backbones, with consistent qualitative improvements in texture and detail. By transferring knowledge more effectively from a heavy teacher to a compact student, DCKD supports efficient, high-quality restoration on resource-constrained devices.
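To make the DCR idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that model weights and images are flat lists of floats, that distances are plain L1 distances, and that the loss is a generic ratio of the distance to the teacher output (positive) over the summed distance to negative samples. The function names `ema_update`, `l1`, and `contrastive_ratio`, the decay value, and the choice of negatives are all hypothetical; the summary above does not specify them.

```python
def ema_update(history, student, decay=0.999):
    """Blend the student's current weights into the history model.

    Assumption: weights are flat lists of floats; decay is a hypothetical value.
    """
    return [decay * h + (1.0 - decay) * s for h, s in zip(history, student)]


def l1(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def contrastive_ratio(student_out, teacher_out, negatives, eps=1e-8):
    """Generic ratio-style contrastive loss (an illustrative stand-in for DCR).

    The numerator pulls the student toward the teacher output; the denominator
    pushes it away from the negatives. Here the negatives are assumed to be,
    for example, a degraded image and the output of the EMA history model.
    """
    pos = l1(student_out, teacher_out)
    neg = sum(l1(student_out, n) for n in negatives)
    return pos / (neg + eps)
```

In this sketch, `ema_update` would be called after each student optimization step, so the history model lags behind the student and the negatives it produces shift as training progresses. That moving reference is one way to read the "dynamic lower bound" described above.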
Abstract
Knowledge distillation (KD) is a valuable yet challenging approach that enhances a compact student network by having it learn from a high-performance but cumbersome teacher model. However, previous KD methods for image restoration overlook the state of the student during distillation and adopt a fixed solution space, which limits the capability of KD. In addition, an L1-type loss alone struggles to exploit the distribution information of images. In this work, we propose a novel dynamic contrastive knowledge distillation (DCKD) framework for image restoration. Specifically, we introduce dynamic contrastive regularization to perceive the student's learning state and dynamically adjust the distilled solution space using contrastive learning. We also propose a distribution mapping module to extract and align the pixel-level category distributions of the teacher and student models. DCKD is a structure-agnostic distillation framework: it adapts to different backbones and can be combined with methods that optimize upper-bound constraints to further enhance model performance. Extensive experiments demonstrate that DCKD significantly outperforms state-of-the-art KD methods across various image restoration tasks and backbones.
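The alignment of pixel-level category distributions can be illustrated with a small sketch. It assumes that each pixel has a vector of logits over K categories for both teacher and student, and that the teacher's softmax distribution serves as a soft target in a cross-entropy term. How the categories are derived, and whether the target is soft or hard, is not stated in the abstract, so `distribution_cross_entropy` and its inputs are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's category logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distribution_cross_entropy(teacher_logits, student_logits, eps=1e-12):
    """Mean over pixels of -sum_k p_T(k) * log p_S(k).

    Assumption: both arguments are lists of per-pixel logit lists of equal
    shape; the teacher distribution is used as a soft target.
    """
    total = 0.0
    for t_row, s_row in zip(teacher_logits, student_logits):
        p_t = softmax(t_row)
        p_s = softmax(s_row)
        total -= sum(pt * math.log(ps + eps) for pt, ps in zip(p_t, p_s))
    return total / len(teacher_logits)
```

The term is smallest when the student's per-pixel distribution matches the teacher's, so it supplies a distribution-level signal that a pixel-wise L1 loss does not express directly.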
