Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models
Kyungsung Lee, Donggyu Lee, Myungjoo Kang
TL;DR
SaFaRI introduces spatial-and-frequency-aware priors into diffusion-based image restoration by replacing the pixel-domain fidelity with a transformed fidelity $\lVert \psi(\boldsymbol y)-\psi(\mathbf A \hat{\boldsymbol x}_{0|t}) \rVert_2^2$ that combines bicubic upsampling with Fourier-domain high-pass and low-pass components. The method uses an injective $\psi$ to decompose the fidelity into spatial, high-frequency, and low-frequency terms, with a theoretical bound on the conditioning error, $|p_{\psi,t}(\boldsymbol y|\boldsymbol x_t)-p_\psi(\boldsymbol y|\hat{\boldsymbol x}_{0|t})| \le \frac{1}{\mathrm{e}^{1/2}Z_{\psi} \gamma} L_{\psi} \|\mathbf A\| m_1$, and uses Tweedie's formula to relate $\hat{\boldsymbol x}_{0|t}$ to the score. Empirically, SaFaRI achieves state-of-the-art zero-shot IR performance on ImageNet and FFHQ across inpainting, denoising/deblurring, and super-resolution, surpassing DiffPIR, DPS, PnP-ADMM, and ILVR in LPIPS and FID, with qualitative improvements in texture and boundary fidelity. By enforcing data fidelity in both the spatial and spectral domains, SaFaRI offers a training-free improvement in restoration quality, while leaving the transform-induced perturbations open to further theoretical study.
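For reference, Tweedie's formula in its commonly used variance-preserving form is written out below. The forward process $\boldsymbol x_t = \sqrt{\bar\alpha_t}\,\boldsymbol x_0 + \sqrt{1-\bar\alpha_t}\,\boldsymbol\epsilon$ with $\boldsymbol\epsilon \sim \mathcal N(\mathbf 0, \mathbf I)$ and the schedule symbol $\bar\alpha_t$ are assumptions made here for illustration; the paper's own parameterization may differ.

$$
\hat{\boldsymbol x}_{0|t} = \mathbb E[\boldsymbol x_0 \mid \boldsymbol x_t] = \frac{1}{\sqrt{\bar\alpha_t}} \left( \boldsymbol x_t + (1-\bar\alpha_t)\, \nabla_{\boldsymbol x_t} \log p_t(\boldsymbol x_t) \right)
$$

Under this assumption, a learned score estimate gives $\hat{\boldsymbol x}_{0|t}$ in closed form, and that estimate is what enters the transformed fidelity.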
Abstract
Diffusion models have recently emerged as a promising framework for Image Restoration (IR), owing to their ability to produce high-quality reconstructions and their compatibility with established methods. Existing methods for solving noisy inverse problems in IR consider pixel-wise data fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware diffusion model for IR with Gaussian noise. Our model encourages images to preserve data fidelity in both the spatial and frequency domains, resulting in enhanced reconstruction quality. We comprehensively evaluate the performance of our model on a variety of noisy inverse problems, including inpainting, denoising, and super-resolution. Our evaluation demonstrates that SaFaRI achieves state-of-the-art performance on both the ImageNet and FFHQ datasets, outperforming existing zero-shot IR methods in terms of LPIPS and FID.
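To make the idea of a fidelity measured in both domains concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the radial cutoff, the per-term weights, and the choice to stack the image with its low-pass and high-pass parts are all assumptions for illustration, and the bicubic upsampling step is omitted. Keeping the identity component makes this toy `psi` injective.

```python
import numpy as np


def low_pass_mask(shape, cutoff):
    """Boolean mask keeping frequencies whose normalized radius is <= cutoff."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.sqrt(fy ** 2 + fx ** 2) <= cutoff


def psi(img, cutoff=0.1):
    """Hypothetical transform: stack (image, low-pass part, high-pass part)."""
    spectrum = np.fft.fft2(img)
    mask = low_pass_mask(img.shape, cutoff)
    low = np.fft.ifft2(spectrum * mask).real    # low-frequency content
    high = np.fft.ifft2(spectrum * ~mask).real  # high-frequency content
    return np.stack([img, low, high])


def transformed_fidelity(y, Ax, cutoff=0.1, weights=(1.0, 1.0, 1.0)):
    """Weighted squared error between psi(y) and psi(Ax).

    weights = (spatial, low-frequency, high-frequency); assumed values.
    """
    diff = psi(y, cutoff) - psi(Ax, cutoff)
    per_term = np.sum(diff ** 2, axis=(1, 2))  # one squared norm per component
    return float(np.dot(np.asarray(weights), per_term))
```

With equal weights the low-pass and high-pass terms add up to the pixel-wise error by Parseval's identity, so the total is simply twice the pixel-domain fidelity; unequal weights are what let the frequency terms emphasize, for example, texture over smooth regions.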
