Distillation-Supervised Convolutional Low-Rank Adaptation for Efficient Image Super-Resolution
Xinning Chai, Yao Zhang, Yuxuan Zhang, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song
TL;DR
This work tackles efficient single-image super-resolution by applying ConvLoRA to a lightweight backbone (SPAN) and strengthening fine-tuning with knowledge distillation that preserves second-order feature statistics. It introduces the SConvLB module, which integrates low-rank adapters into the SPAB block, and extends ConvLoRA to the pixel shuffle stage, yielding performance gains without increasing inference cost. A hybrid distillation scheme combines a spatial affinity loss with pixel-level and reconstruction losses to guide the student model toward critical textures and structures ($L_{total} = \lambda_1 L_{rec} + \lambda_2 L_{TS} + \lambda_3 L_{AD}$). Empirically, DSCLoRA achieves consistent PSNR/SSIM gains over SPAN across multiple benchmarks, with the DSCLoRA-L variant delivering the best performance while remaining lightweight. It ranked first in the Overall Performance Track of the NTIRE 2025 Efficient SR Challenge, demonstrating practical value for real-time SR tasks.
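The hybrid objective above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not define the individual terms, so the sketch assumes $L_{rec}$ is an L1 loss between the student output and the ground truth, $L_{TS}$ is an L1 loss between the student and teacher outputs, and $L_{AD}$ is an L1 distance between spatial affinity matrices (cosine similarities between all pairs of spatial positions, a second-order statistic of the feature map). The function names and the default weights are hypothetical.

```python
import numpy as np

def spatial_affinity(feat, eps=1e-8):
    """Cosine-similarity matrix between all spatial positions of a (C, H, W) feature map."""
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    # Normalize each spatial position's channel vector to unit length.
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    return flat.T @ flat  # shape (H*W, H*W)

def affinity_distillation_loss(student_feat, teacher_feat):
    """Assumed form of L_AD: mean absolute difference between affinity matrices."""
    diff = spatial_affinity(student_feat) - spatial_affinity(teacher_feat)
    return float(np.mean(np.abs(diff)))

def total_loss(sr_student, sr_teacher, hr, feat_s, feat_t, lambdas=(1.0, 1.0, 1.0)):
    """L_total = lambda_1 * L_rec + lambda_2 * L_TS + lambda_3 * L_AD (term definitions assumed)."""
    l_rec = float(np.mean(np.abs(sr_student - hr)))         # student vs. ground truth
    l_ts = float(np.mean(np.abs(sr_student - sr_teacher)))  # student vs. teacher output
    l_ad = affinity_distillation_loss(feat_s, feat_t)       # second-order feature statistics
    return lambdas[0] * l_rec + lambdas[1] * l_ts + lambdas[2] * l_ad
```

Because the affinity matrix has shape (H·W, H·W) regardless of the channel count, this form of the loss compares how spatial positions relate to one another rather than matching raw feature values.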
Abstract
Convolutional neural networks (CNNs) have been widely used in efficient image super-resolution. However, for CNN-based methods, performance gains often require deeper networks and larger feature maps, which increase complexity and inference cost. Inspired by LoRA's success in fine-tuning large language models, we explore its application to lightweight models and propose Distillation-Supervised Convolutional Low-Rank Adaptation (DSCLoRA), which improves model performance without increasing architectural complexity or inference cost. Specifically, we integrate ConvLoRA into the efficient SR network SPAN by replacing the SPAB module with the proposed SConvLB module and incorporating ConvLoRA layers into both the pixel shuffle block and its preceding convolutional layer. DSCLoRA leverages low-rank decomposition for parameter updates and employs a spatial feature affinity-based knowledge distillation strategy to transfer second-order statistical information from the teacher model (pre-trained SPAN) to the student model (ours). This method preserves the core knowledge of lightweight models and facilitates optimal solution discovery under certain conditions. Experiments on benchmark datasets show that DSCLoRA improves PSNR and SSIM over SPAN while retaining SPAN's efficiency and delivering competitive image quality. Notably, DSCLoRA ranked first in the Overall Performance Track of the NTIRE 2025 Efficient Super-Resolution Challenge. Our code and models are publicly available at https://github.com/Yaozzz666/DSCF-SR.
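One common way a low-rank adapter can leave inference cost unchanged is by folding the update into the frozen kernel after training, since convolution is linear in its weights. The NumPy sketch below illustrates that idea only. It assumes the update is factorized as a product of two matrices reshaped to the kernel shape, which may differ from the exact factorization used in SConvLB; the names `conv2d`, `lora_delta`, `lora_a`, and `lora_b` are hypothetical.

```python
import numpy as np

def conv2d(x, weight):
    """Naive 'valid' cross-correlation. x: (C_in, H, W); weight: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = weight.shape
    _, h, w = x.shape
    out = np.zeros((c_out, h - k + 1, w - k + 1))
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patch = x[:, i:i + k, j:j + k]
            # Contract the kernel with the patch over (C_in, k, k).
            out[:, i, j] = np.tensordot(weight, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def lora_delta(lora_b, lora_a, kernel_shape, scale=1.0):
    """Reshape the low-rank product B @ A into a kernel-shaped update (assumed factorization)."""
    return scale * (lora_b @ lora_a).reshape(kernel_shape)

rng = np.random.default_rng(0)
c_in, c_out, k, rank = 4, 6, 3, 2
x = rng.standard_normal((c_in, 8, 8))
frozen_w = rng.standard_normal((c_out, c_in, k, k))   # pre-trained kernel, kept frozen
lora_b = rng.standard_normal((c_out, rank))           # trainable low-rank factor
lora_a = rng.standard_normal((rank, c_in * k * k))    # trainable low-rank factor

delta_w = lora_delta(lora_b, lora_a, frozen_w.shape)

# Training-time view: frozen branch plus adapter branch.
y_train = conv2d(x, frozen_w) + conv2d(x, delta_w)

# Inference-time view: one convolution with the merged kernel.
merged_w = frozen_w + delta_w
y_infer = conv2d(x, merged_w)
```

In this sketch the two outputs agree up to floating-point error and the merged kernel has the same shape as the original, so the deployed layer has the same parameter count and FLOPs as the unadapted one.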
