FideDiff: Efficient Diffusion Model for High-Fidelity Image Motion Deblurring
Xiaoyang Liu, Zhengyan Zhou, Zihang Xu, Jiezhang Cao, Zheng Chen, Yulun Zhang
TL;DR
FideDiff tackles the slow inference and limited fidelity of pretrained diffusion models in image motion deblurring by reframing blur as a diffusion-like trajectory and enforcing cross-time consistency, so that a single denoising step can recover the clean image. It introduces a time-consistent training objective, a high-fidelity foundation model based on Stable Diffusion, and Kernel ControlNet, which injects blur-kernel conditioning and supports adaptive timestep prediction. A three-stage training pipeline (foundation model, kernel estimator pretraining, and joint kernel-control training with reblur and time losses) yields full-reference metrics that surpass previous diffusion-based methods, along with competitive perceptual quality and substantial speed advantages over multi-step diffusion models. Evaluations on GoPro, HIDE, and RealBlur demonstrate strong fidelity, robustness to real-world blur, and practical inference speed, making diffusion-based restoration more viable for industrial applications. The work establishes a robust baseline for high-fidelity, fast diffusion-based restoration and points to scalable ways of deploying diffusion priors in low-level vision tasks.
Abstract
Image motion deblurring has made significant progress in recent years, driven by CNNs and transformers. Large-scale pre-trained diffusion models, which capture rich real-world priors, have shown great promise for high-quality image restoration tasks such as deblurring, demonstrating stronger generative capabilities than CNN- and transformer-based methods. However, prohibitive inference time and compromised fidelity still limit the full potential of diffusion models. To address these challenges, we introduce FideDiff, a novel single-step diffusion model designed for high-fidelity deblurring. We reformulate motion deblurring as a diffusion-like process in which each timestep represents a progressively blurred image, and we train a consistency model that aligns all timesteps to the same clean image. By reconstructing training data with matched blur trajectories, the model learns temporal consistency, enabling accurate one-step deblurring. We further enhance performance by integrating Kernel ControlNet for blur kernel estimation and by introducing adaptive timestep prediction. Our model achieves superior performance on full-reference metrics, surpassing previous diffusion-based methods and matching the performance of other state-of-the-art models. FideDiff offers a new direction for applying pre-trained diffusion models to high-fidelity image restoration tasks and establishes a robust baseline for further advancing diffusion models in real-world industrial applications. Our dataset and code will be available at https://github.com/xyLiu339/FideDiff.
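To make the idea of a blur trajectory with a shared clean target concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that blur at step t is synthesized by averaging the 2t+1 sharp frames centered on the middle frame, and that consistency is measured as a mean squared error against the same clean frame at every step. The names `blur_trajectory`, `consistency_loss`, and `predict` are hypothetical, and the actual FideDiff network, trajectory construction, and losses may differ.

```python
import numpy as np


def blur_trajectory(frames, num_steps):
    """Build a toy blur trajectory from a short burst of sharp frames.

    Step t is the average of the 2t+1 frames centered on the middle frame,
    so step 0 is the sharp middle frame and later steps are blurrier.
    (Assumption for illustration; not taken from the paper.)
    """
    frames = np.asarray(frames, dtype=np.float64)
    mid = len(frames) // 2
    if num_steps > min(mid, len(frames) - 1 - mid):
        raise ValueError("not enough frames for the requested number of steps")
    return np.stack(
        [frames[mid - t : mid + t + 1].mean(axis=0) for t in range(num_steps + 1)]
    )


def consistency_loss(predict, trajectory, clean):
    """Average MSE between predict(x_t, t) and one shared clean target."""
    losses = [np.mean((predict(x_t, t) - clean) ** 2) for t, x_t in enumerate(trajectory)]
    return float(np.mean(losses))


# Toy burst: a single bright pixel moving one column per frame.
frames = np.zeros((7, 8, 8))
for k in range(7):
    frames[k, 4, k] = 1.0

clean = frames[3]                      # sharp middle frame is the shared target
trajectory = blur_trajectory(frames, num_steps=3)

# A predictor that returns its input unchanged is penalized on blurred steps,
# while a (hypothetical) perfect predictor maps every step to the clean frame.
identity_loss = consistency_loss(lambda x_t, t: x_t, trajectory, clean)
oracle_loss = consistency_loss(lambda x_t, t: clean, trajectory, clean)
```

In this sketch every step of the trajectory is supervised with the same target, so a model trained to minimize the loss could be queried at any single step, which is the intuition behind one-step deblurring described in the abstract.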
