Efficient Image Deblurring Networks based on Diffusion Models
Kang Chen, Yuanjie Liu
TL;DR
This work tackles high-resolution defocus and motion deblurring under memory constraints by introducing Swintormer, a Sliding Window Transformer that uses diffusion-model latent priors to guide restoration. It achieves linear-complexity attention via Shifted Windows-Dconv Attention and uses a latent diffusion model (LDM) to generate prior features $z_0$ in latent space, which are fed into a diffusion-guided deblurring pipeline trained with both $L_1$ and perceptual losses. The method delivers state-of-the-art results on the RealDOF, DPDD, and GoPro datasets while reducing per-iteration cost from 140.35 GMACs to 8.02 GMACs, enabling high-quality deblurring on memory-limited devices. An ablation study confirms the contribution of the diffusion priors, the windowing strategies, and a pre-processing scheme that keeps training and inference consistent. Code is released for reproducibility at https://github.com/bnm6900030/swintormer.
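To illustrate why restricting attention to fixed-size windows gives cost that grows linearly with image size, the following is a minimal sketch of generic non-overlapping window partitioning with a count of token pairs. It is not the Shifted Windows-Dconv Attention of Swintormer; the function names, the window size of 8, and the pair-count cost model are assumptions chosen for illustration.

```python
import numpy as np


def window_partition(x, window):
    """Split an (H, W, C) array into non-overlapping (window, window, C) tiles.

    Hypothetical helper for illustration; H and W must be multiples of `window`.
    """
    h, w, c = x.shape
    assert h % window == 0 and w % window == 0, "pad the input first"
    x = x.reshape(h // window, window, w // window, window, c)
    # Move the two tile-index axes to the front, then flatten them into one.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window, window, c)


def attention_pairs_global(h, w):
    """Token pairs scored when every pixel attends to every other pixel."""
    n = h * w
    return n * n


def attention_pairs_windowed(h, w, window):
    """Token pairs scored when attention is restricted to each window."""
    num_windows = (h // window) * (w // window)
    tokens_per_window = window * window
    return num_windows * tokens_per_window ** 2


if __name__ == "__main__":
    feature_map = np.zeros((64, 64, 3), dtype=np.float32)
    tiles = window_partition(feature_map, window=8)
    print("tiles:", tiles.shape)
    print("global pairs:", attention_pairs_global(64, 64))
    print("windowed pairs (64x64):", attention_pairs_windowed(64, 64, 8))
    print("windowed pairs (128x64):", attention_pairs_windowed(128, 64, 8))
```

Under this simplified cost model, doubling the image height doubles the windowed pair count, whereas the global pair count quadruples.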
Abstract
This article presents Swintormer, a sliding window model for defocus deblurring that achieves the best performance to date with remarkably low memory usage. The method uses a diffusion model to generate latent prior features, which help restore finer image detail. It also adapts the sliding window strategy and incorporates specialized Transformer blocks to improve inference efficiency. Together, these changes substantially reduce the Multiply-Accumulate Operations (MACs) per iteration and, with them, the memory required. Compared with GRL, the currently leading method, Swintormer reduces the computational load that is constrained by memory capacity from 140.35 GMACs to 8.02 GMACs, while improving the Peak Signal-to-Noise Ratio (PSNR) for defocus deblurring from 27.04 dB to 27.07 dB. This makes it possible to process higher-resolution images on memory-limited devices, broadening the range of application scenarios. The article concludes with an ablation study that examines how each network module contributes to the final performance. The source code and model will be available at the following website: https://github.com/bnm6900030/swintormer.
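For readers who want to relate the reported numbers to each other, the sketch below computes the MAC reduction factor implied by the two figures above and shows the standard PSNR definition. The function name `psnr` and the assumption that images are scaled to [0, 1] are illustrative choices, not taken from the released code.

```python
import numpy as np


def psnr(reference, restored, max_value=1.0):
    """Standard PSNR in dB; assumes both images share the range [0, max_value]."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


# Per-iteration cost figures quoted in the abstract.
grl_gmacs = 140.35
swintormer_gmacs = 8.02
reduction_factor = grl_gmacs / swintormer_gmacs

if __name__ == "__main__":
    print(f"MAC reduction factor: {reduction_factor:.1f}x")
    # Toy example: a uniform error of 0.1 on a [0, 1] image.
    reference = np.zeros((4, 4))
    restored = np.full((4, 4), 0.1)
    print(f"toy PSNR: {psnr(reference, restored):.2f} dB")
```

The ratio works out to about 17.5, so the reported 0.03 dB PSNR gain comes at roughly one seventeenth of the per-iteration MACs.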
