Resfusion: Denoising Diffusion Probabilistic Models for Image Restoration Based on Prior Residual Noise

Zhenning Shi; Haoshuai Zheng; Chen Xu; Changsheng Dong; Bin Pan; Xueshuo Xie; Along He; Tao Li; Huazhu Fu

TL;DR

This work proposes Resfusion, a general framework that incorporates the residual term into the diffusion forward process, starts the reverse process directly from the noisy degraded images, and maintains the integrity of existing noise schedules, unifying the training and inference processes.

Abstract

Recently, research on denoising diffusion models has expanded to the field of image restoration. Traditional diffusion-based image restoration methods use degraded images as conditional input to guide the reverse generation process, without modifying the original denoising diffusion process. However, since the degraded images already contain low-frequency information, starting from Gaussian white noise requires more sampling steps than necessary. We propose Resfusion, a general framework that incorporates the residual term into the diffusion forward process and starts the reverse process directly from the noisy degraded images. The form of our inference process is consistent with the DDPM. We introduce a weighted residual noise, named resnoise, as the prediction target and explicitly provide the quantitative relationship between the residual term and the noise term in resnoise. By leveraging a smooth equivalence transformation, Resfusion determines the optimal acceleration step and maintains the integrity of existing noise schedules, unifying the training and inference processes. The experimental results demonstrate that Resfusion achieves competitive performance on the ISTD, LOL, and Raindrop datasets with only five sampling steps. Furthermore, Resfusion can be easily applied to image generation and shows strong versatility. Our code and model are available at https://github.com/nkicsl/Resfusion.

Paper Structure (22 sections, 24 equations, 18 figures, 8 tables, 2 algorithms)

Introduction
Methodology
Learning the resnoise
Smooth equivalence transformation
Experiments
Ablation Study
The analysis of the residual term and the noise term
Equivalent representations of the loss function
Discussion
Conclusion
Acknowledgements
Appendix Section
Detailed proof
Comparison with other methods
Image translation
...and 7 more sections

Figures (18)

Figure 1: The proposed Resfusion is a general framework for image restoration and can be easily extended to image generation (by setting $\hat{x}_{0}=0$). We introduce the residual term ($R = \hat{x}_{0}-x_{0}$) into the forward process, redefine $q(x_{t} | x_{t-1})$ as $q(x_{t} | x_{t-1}, R)$ (shown by the orange arrow), and call this diffusion process resnoise diffusion. By employing a novel technique called "smooth equivalence transformation", we can directly use the degraded image $\hat{x}_{0}$ to obtain $x_{T'}$ (shown by the blue arrow). We bridge the gap between the input image and the ground truth, unifying the training and inference processes.
Figure 2: The working principle of Resfusion. $x_{0}$ represents the distribution of the ground truth, while $\hat{x}_{0}$ represents the distribution of the degraded images. $\hat{x}_{0} - x_{0}$ represents the gap between them, defined as the residual term $R$ in Eq. \ref{eq: residual term}. Resfusion does not explicitly guide $\hat{x}_{0}$ to $x_{0}$. Instead, it implicitly learns the distribution of $R$ by running the resnoise-diffusion reverse process from $x_{t}$ to $x_{0}$. The resnoise-diffusion reverse process can be pictured as a diffusion reverse process from $R+\epsilon$ to $x_{0}$ (shown by the violet arrow), guiding $x_{t}$ gradually towards $x_{0}$ along this direction. By similar triangles, the coefficient of $R$ at step $t$ is $1-\sqrt{\overline{\alpha}_{t}}$. At any step $t$ during training, $x_{t}$ can be calculated from $x_{0}$ and $R$ through Eq. \ref{eq: x_t_resnoise}.
Figure 3: Visual comparisons of the restored results by different shadow-removal methods on the ISTD dataset.
Figure 4: Visual comparisons of the restored results by different image restoration methods on the LOL dataset and the Raindrop dataset.
Figure 5: The analysis of the residual term and the noise term on the LOL dataset. Removing only the noise reconstructs the details of the degraded image without causing any semantic shift. Removing only the residual accomplishes the semantic shift (from low-light to normal-light) without reconstructing the details. Removing resnoise achieves both the semantic shift and the detail reconstruction.
...and 13 more figures
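
The forward step described in the caption of Figure 2 can be illustrated with a minimal NumPy sketch. Only the residual definition $R = \hat{x}_{0}-x_{0}$ and its coefficient $1-\sqrt{\overline{\alpha}_{t}}$ come from the captions above. The coefficients on $x_{0}$ and $\epsilon$ are assumed to follow the standard DDPM form, and the linear beta schedule, the function names (`linear_alpha_bar`, `resnoise_forward`), and the synthetic images are hypothetical choices made for illustration, not taken from the authors' code.

```python
import numpy as np


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an illustrative linear schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def resnoise_forward(x0, x0_hat, alpha_bar_t, eps):
    """Sample x_t from the ground truth x0 and the degraded image x0_hat.

    The coefficient (1 - sqrt(alpha_bar_t)) on the residual follows the
    Figure 2 caption. The sqrt(alpha_bar_t) and sqrt(1 - alpha_bar_t)
    coefficients are ASSUMED to match the standard DDPM forward process.
    """
    residual = x0_hat - x0  # R = x0_hat - x0
    s = np.sqrt(alpha_bar_t)
    return s * x0 + (1.0 - s) * residual + np.sqrt(1.0 - alpha_bar_t) * eps


# Synthetic stand-ins for a ground-truth image and its degraded version.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 8, 8))
x0_hat = x0 + 0.3 * rng.standard_normal(x0.shape)
eps = rng.standard_normal(x0.shape)

alpha_bar = linear_alpha_bar()
x_t = resnoise_forward(x0, x0_hat, alpha_bar[500], eps)
```

Under this assumed form, the $x_{0}$ contributions cancel when $\sqrt{\overline{\alpha}_{t}} = 1/2$, leaving $x_{t} = \frac{1}{2}\hat{x}_{0} + \sqrt{1-\overline{\alpha}_{t}}\,\epsilon$, which depends only on the degraded image and the noise. This is consistent with the caption of Figure 1, where $x_{T'}$ is obtained directly from $\hat{x}_{0}$, although how $T'$ is chosen is not stated in this summary.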
