Unpaired Photo-realistic Image Deraining with Energy-informed Diffusion Model
Yuanbo Wen, Tao Gao, Ting Chen
TL;DR
This work tackles unpaired photo-realistic image deraining by introducing UPID-EDM, a diffusion-based framework guided by a dual-consistent energy function. The energy function decomposes into a rain-relevance discarding component and a rain-irrelevance preserving component, both informed by learnable domain-representation prompts that exploit CLIP priors. By updating the diffusion score with the gradient of the energy term, the model performs reverse sampling from rainy inputs using a clean-domain diffusion model, yielding high-fidelity, natural derained images without paired data. Empirical results on five benchmarks show state-of-the-art performance among unpaired methods on both supervised and no-reference metrics, while ablations validate the contributions of the energy functions, the prompts, and the choice of starting time. The approach highlights the potential of combining energy guidance with diffusion models for challenging unpaired restoration tasks, albeit with notable computational demands and occasional hallucinations in extreme rain scenarios.
Abstract
Existing unpaired image deraining approaches struggle to accurately capture the characteristics that distinguish the rainy and clean domains, resulting in residual degradation and color distortion in the reconstructed images. To this end, we propose an energy-informed diffusion model for unpaired photo-realistic image deraining (UPID-EDM). First, we examine the visual-language priors embedded in the contrastive language-image pre-training model (CLIP) and demonstrate that these priors help discriminate between rainy and clean images. We then introduce a dual-consistent energy function (DEF) that retains rain-irrelevant characteristics while eliminating rain-relevant features; it is trained on unpaired (non-corresponding) rainy and clean images. Finally, we employ its two components, the rain-relevance discarding energy function (RDEF) and the rain-irrelevance preserving energy function (RPEF), to direct the reverse sampling procedure of a pre-trained diffusion model, effectively removing rain streaks while preserving image content. Extensive experiments demonstrate that our energy-informed model surpasses existing unpaired learning approaches on both supervised and no-reference metrics.
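To make the idea of energy-guided sampling concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a 1-D signal in place of an image, an analytic Gaussian prior in place of the pre-trained clean-domain diffusion model, and annealed Langevin dynamics in place of the paper's reverse sampler. The two quadratic energies are hypothetical stand-ins for RDEF and RPEF (the paper learns these with CLIP-based prompts), and all function names are illustrative.

```python
import numpy as np

def blur_matrix(n, k=5):
    """Moving-average operator B, a hypothetical 'rain-irrelevant' feature extractor."""
    B = np.zeros((n, n))
    for i in range(n):
        lo, hi = max(0, i - k // 2), min(n, i + k // 2 + 1)
        B[i, lo:hi] = 1.0 / (hi - lo)
    return B

def energy(x, y, B, w_discard=1.0, w_preserve=1.0):
    """Toy dual energy: a discarding term plus a preserving term (both assumed)."""
    e_discard = 0.5 * np.sum((x - B @ x) ** 2)       # penalise streak-like residue in x
    e_preserve = 0.5 * np.sum((B @ x - B @ y) ** 2)  # keep smooth content of rainy input y
    return w_discard * e_discard + w_preserve * e_preserve

def energy_grad(x, y, B, w_discard=1.0, w_preserve=1.0):
    """Analytic gradient of `energy` with respect to x."""
    H = np.eye(len(x)) - B
    g_discard = H.T @ (H @ x)
    g_preserve = B.T @ (B @ (x - y))
    return w_discard * g_discard + w_preserve * g_preserve

def clean_score(x, sigma):
    """Score of a toy clean prior N(0, I) smoothed with noise level sigma."""
    return -x / (1.0 + sigma ** 2)

def guided_score(x, y, sigma, B, scale=1.0):
    """Unconditional score shifted by the negative energy gradient."""
    return clean_score(x, sigma) - scale * energy_grad(x, y, B)

def guided_sampling(y, B, sigmas, n_inner=10, scale=1.0, seed=0):
    """Annealed Langevin dynamics, started from the noised rainy input."""
    rng = np.random.default_rng(seed)
    x = y + sigmas[0] * rng.standard_normal(y.shape)
    for sigma in sigmas:
        step = 0.1 * sigma ** 2
        for _ in range(n_inner):
            noise = rng.standard_normal(y.shape)
            x = x + step * guided_score(x, y, sigma, B, scale) + np.sqrt(2 * step) * noise
    return x
```

Calling `guided_sampling(y, blur_matrix(len(y)), sigmas=np.linspace(1.0, 0.05, 10))` on a noisy 1-D signal `y` returns a sample pulled toward the prior while the energy terms keep its smooth content close to that of `y`. Starting from the noised input rather than pure noise loosely mirrors the abstract's description of reverse sampling from rainy images.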
