WaveDM: Wavelet-Based Diffusion Models for Image Restoration
Yi Huang, Jiancheng Huang, Jianzhuang Liu, Mingfu Yan, Yu Dong, Jiaxi Lv, Chaoqi Chen, Shifeng Chen
TL;DR
This work addresses the slow inference of diffusion-based image restoration by moving diffusion modeling to the wavelet domain, where the input size is reduced and frequency content is separated. WaveDM runs a diffusion process on the low-frequency spectrum, conditioned on the wavelet spectrum of the degraded image, complements it with a lightweight High Frequency Refinement Module, and uses Efficient Conditional Sampling to reduce the total number of sampling steps to around $5$. The approach achieves state-of-the-art results on twelve restoration benchmarks with efficiency comparable to traditional one-pass methods, and runs over $100\times$ faster than restoration methods built on vanilla diffusion models. The method generalizes across diverse degradations and balances restoration quality with computational efficiency, though it requires lengthy training and large-scale data for best performance.
Abstract
Recent diffusion-based methods outperform traditional models on many image restoration tasks, but they suffer from slow inference. To tackle this problem, this paper proposes a Wavelet-Based Diffusion Model (WaveDM). WaveDM learns the distribution of clean images in the wavelet domain, conditioned on the wavelet spectrum of the degraded image, which makes each sampling step cheaper than modeling in the spatial domain. To preserve restoration performance, a dedicated training strategy is proposed in which the low-frequency and high-frequency spectra are learned by distinct modules. In addition, an Efficient Conditional Sampling (ECS) strategy is developed from experimental observations, which reduces the total number of sampling steps to around 5. Evaluations on twelve benchmark datasets covering image raindrop removal, rain streak removal, dehazing, defocus deblurring, demoiréing, and denoising demonstrate that WaveDM achieves state-of-the-art performance with efficiency comparable to traditional one-pass methods, and that it is over 100$\times$ faster than existing image restoration methods that use vanilla diffusion models.
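To illustrate why working in the wavelet domain shrinks the input, here is a minimal sketch of a single-level 2D discrete wavelet transform in plain Python. It assumes the Haar wavelet and one decomposition level, neither of which is stated in the abstract. The function names `haar_dwt2` and `haar_idwt2` and the sub-band labels are hypothetical, chosen for illustration only, and this is not the authors' implementation.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform (assumed wavelet) of an H x W image.

    Returns four sub-bands, each of size H/2 x W/2: one low-frequency band
    (ll) and three high-frequency detail bands (lh, hl, hh).
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(h // 2):
        for j in range(w // 2):
            # The four pixels of one 2x2 block.
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # block average (low frequency)
            lh[i][j] = (a - b + c - d) / 2  # difference between columns
            hl[i][j] = (a + b - c - d) / 2  # difference between rows
            hh[i][j] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution image."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img


# A toy 4x4 "image" stands in for a real degraded input.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Under these assumptions, each sub-band has half the height and half the width of the input, so a model that operates on the low-frequency band sees a quarter of the spatial positions per decomposition level. The transform is also exactly invertible, so no image information is lost by moving to the wavelet domain.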
