Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models

Kyungsung Lee; Donggyu Lee; Myungjoo Kang

TL;DR

SaFaRI introduces spatial-and-frequency-aware priors into diffusion-based image restoration by replacing the pixel-domain fidelity with a transformed fidelity $\lVert \psi(\boldsymbol y)-\psi(\mathbf A \hat{\boldsymbol x}_{0|t}) \rVert_2^2$ that combines bicubic upsampling with Fourier-domain high-pass and low-pass components. The method uses an injective $\psi$ to decompose the fidelity into spatial, high-frequency, and low-frequency terms, with a theoretical bound ensuring stable conditioning, $|p_{\psi,t}(\boldsymbol y|\boldsymbol x_t)-p_\psi(\boldsymbol y|\hat{\boldsymbol x}_{0|t})| \le \frac{1}{\mathrm{e}^{1/2}Z_{\psi} \gamma} L_{\psi} \|\mathbf A\| m_1$, and uses Tweedie's formula to relate $\hat{\boldsymbol x}_{0|t}$ to the score. Empirically, SaFaRI achieves state-of-the-art zero-shot IR performance on ImageNet and FFHQ across inpainting, denoising/deblurring, and super-resolution, surpassing DiffPIR, DPS, PnP-ADMM, and ILVR in LPIPS and FID, with qualitative improvements in texture and boundary fidelity. By enforcing data fidelity in both the spatial and spectral domains, SaFaRI offers a training-free improvement in restoration quality, while prompting further theoretical study of the transform-induced perturbations.
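
As a reading aid, the two ingredients named above can be written out explicitly. This is a sketch under stated assumptions, not the paper's own notation: the variance-preserving parameterization $\boldsymbol x_t = \sqrt{\bar\alpha_t}\,\boldsymbol x_0 + \sqrt{1-\bar\alpha_t}\,\boldsymbol\epsilon$ and the symbol $\bar\alpha_t$ are assumed, as is the reading of $\psi$ as a plain stacking of the three maps $\psi_s$, $\psi_H$, $\psi_L$ without per-term weights.

$$
\hat{\boldsymbol x}_{0|t} = \mathbb{E}[\boldsymbol x_0 \mid \boldsymbol x_t] = \frac{1}{\sqrt{\bar\alpha_t}}\left(\boldsymbol x_t + (1-\bar\alpha_t)\,\nabla_{\boldsymbol x_t}\log p_t(\boldsymbol x_t)\right)
$$

$$
\lVert \psi(\boldsymbol y)-\psi(\mathbf A \hat{\boldsymbol x}_{0|t}) \rVert_2^2 = \sum_{k \in \{s, H, L\}} \lVert \psi_k(\boldsymbol y)-\psi_k(\mathbf A \hat{\boldsymbol x}_{0|t}) \rVert_2^2
$$

The first line is Tweedie's formula for the assumed parameterization. The second follows from the stacking assumption alone, since the squared Euclidean norm of a concatenated vector is the sum of the squared norms of its blocks.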

Abstract

Diffusion models have recently emerged as a promising framework for Image Restoration (IR), owing to their ability to produce high-quality reconstructions and their compatibility with established methods. Existing methods for solving noisy inverse problems in IR consider pixel-wise data fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware diffusion model for IR with Gaussian noise. Our model encourages images to preserve data fidelity in both the spatial and frequency domains, resulting in enhanced reconstruction quality. We comprehensively evaluate the performance of our model on a variety of noisy inverse problems, including inpainting, denoising, and super-resolution. Our evaluation demonstrates that SaFaRI achieves state-of-the-art performance on both the ImageNet and FFHQ datasets, outperforming existing zero-shot IR methods in terms of LPIPS and FID metrics.
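
To make the idea of data fidelity in both domains concrete, the following is a minimal NumPy sketch, not the authors' implementation. All function names, the ideal circular frequency mask, the cutoff `radius`, and the equal weights are hypothetical choices made for illustration. Nearest-neighbour upsampling stands in for the bicubic upsampling described in the paper, and the filters are applied to the measurement-space images directly, which is also an assumption.

```python
import numpy as np


def upsample_nearest(img, scale):
    """Nearest-neighbour upsampling (a stand-in for bicubic upsampling)."""
    return np.kron(img, np.ones((scale, scale)))


def frequency_split(img, radius):
    """Split a 2-D image into low-pass and high-pass parts.

    Uses an ideal circular mask in the centered Fourier domain; the two
    parts sum back to the input image.
    """
    h, w = img.shape
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    low_mask = dist <= radius
    low = np.fft.ifft2(np.fft.ifftshift(spectrum * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * ~low_mask)).real
    return low, high


def transformed_fidelity(y, ax_hat, scale=2, radius=4.0, weights=(1.0, 1.0, 1.0)):
    """Sum of spatial, high-frequency, and low-frequency squared errors.

    `y` is the measurement and `ax_hat` plays the role of A applied to the
    denoised estimate; both are 2-D arrays of the same shape.
    """
    w_s, w_h, w_l = weights
    spatial = np.sum((upsample_nearest(y, scale) - upsample_nearest(ax_hat, scale)) ** 2)
    y_low, y_high = frequency_split(y, radius)
    x_low, x_high = frequency_split(ax_hat, radius)
    high = np.sum((y_high - x_high) ** 2)
    low = np.sum((y_low - x_low) ** 2)
    return w_s * spatial + w_h * high + w_l * low


rng = np.random.default_rng(0)
y = rng.standard_normal((16, 16))
ax_hat = y + 0.1 * rng.standard_normal((16, 16))
loss = transformed_fidelity(y, ax_hat)
```

In a guided sampler, the gradient of such a loss with respect to the current state would steer each reverse step; computing that gradient requires an automatic-differentiation framework and is outside this sketch.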

Paper Structure (23 sections, 3 theorems, 28 equations, 13 figures, 12 tables, 1 algorithm)

Introduction
Background
Score-based Diffusion Models
Image Restoration by Conditional Diffusion
Proposed Method
Modifying Data-fidelity
Theoretical Analysis
SaFaRI
Experiments
Implementation Details
Quantitative Experiments
Qualitative Experiments
Conclusion
Proofs
Experimental Details
...and 8 more sections

Key Result

Lemma 0

The modified conditional probability $p_{\psi}({\boldsymbol y} | {\boldsymbol x}_0)$ defined in Eq. (14) is Lipschitz continuous with respect to ${\boldsymbol x}_0$.

Figures (13)

Figure 1: Examples and visual explanations of our method's functionality. (a)-(d): Results of the image restoration tasks: box-type inpainting, random-type inpainting, Gaussian deblurring, and super-resolution, respectively. (e): The first row illustrates the sequential changes in $\mathbf A \hat{\boldsymbol x}_{0|t}$ after applying high-pass filtering, leading to the final filtered image of ${\boldsymbol y}$, while the second row presents the low-pass counterparts.
Figure 2: The overview of SaFaRI. Starting with the intermediate state ${\boldsymbol x}_t$, we first generate the unconditional prediction $\hat{\boldsymbol x }_{0|t}$ using the diffusion model. Then we obtain the next state ${\boldsymbol x}_{t-1}$ by leveraging the loss guidance terms obtained through bicubic upsampling $\psi_{s}$ with scaling factor $r$, high-pass filter $\psi_H$ and the low-pass filter $\psi_L$.
Figure 3: Qualitative results of image restoration. We establish the efficacy of SaFaRI in restoring images across a variety of tasks.
Figure 4: The results of SaFaRI, Gaussian blurring under different $\rho_t^H$ configurations. (left) The case $\rho_t^H = 0.25 / \sqrt{{\mathcal{L}}_H}$ (middle) The case $\rho_t^H = 1.25 / \sqrt{{\mathcal{L}}_H}$ (right) Ground Truth.
Figure 5: The results of SaFaRI, Gaussian blurring under different $\rho_t^L$ configurations. (left) The case $\rho_t^L = 1.25 / \sqrt{{\mathcal{L}}_L}$ (middle) The case $\rho_t^L = 0.25 / \sqrt{{\mathcal{L}}_L}$ (right) Ground Truth.
...and 8 more figures

Theorems & Definitions (6)

Lemma 0
Theorem 1
Remark 1
proof
Theorem 1
proof
