Restoration by Generation with Constrained Priors

Zheng Ding, Xuaner Zhang, Zhuowen Tu, Zhihao Xia

TL;DR

The paper tackles blind image restoration by leveraging a pretrained diffusion model and constraining its generative space with anchor images to preserve input identity. By adding noise to the degraded input and denoising with a fixed diffusion process, and by forming either a personal or generative album as anchors, the method directly samples high-quality, realistic restorations without relying on paired data or known degradation models. The approach shows superior performance on real-world blind face restoration benchmarks, achieves strong identity preservation in personalized restoration, and generalizes to non-face categories like dogs and cats. This restoration-by-generation framework offers practical benefits for real-world images with unknown degradations and enables subject-specific personalization through album-based space constraining.

Abstract

The inherent generative power of denoising diffusion models makes them well-suited for image restoration tasks where the objective is to find the optimal high-quality image within the generative space that closely resembles the input image. We propose a method to adapt a pretrained diffusion model for image restoration by simply adding noise to the input image to be restored and then denoising it. Our method is based on the observation that the space of a generative model needs to be constrained. We impose this constraint by finetuning the generative model with a set of anchor images that capture the characteristics of the input image. With the constrained space, we can then leverage the sampling strategy used for generation to perform image restoration. We evaluate against previous methods and show superior performance on multiple real-world restoration datasets in preserving identity and image quality. We also demonstrate an important and practical application on personalized restoration, where we use a personal album as the anchor images to constrain the generative space. This approach allows us to produce results that accurately preserve high-frequency details, which previous works are unable to do. Project webpage: https://gen2res.github.io.
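
The "add noise, then denoise" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear beta schedule, the function names (`make_alpha_bar`, `add_noise`, `restore`), and the `denoise_step` callable standing in for one reverse step of the finetuned diffusion model are all assumptions made for illustration. The sketch uses the standard forward-diffusion formula $y_t = \sqrt{\bar\alpha_t}\, y + \sqrt{1 - \bar\alpha_t}\, \epsilon$.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention factors for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def add_noise(image, t, alpha_bar, rng):
    """Forward diffusion to step t: sqrt(a_t) * image + sqrt(1 - a_t) * eps."""
    eps = rng.standard_normal(image.shape)
    return np.sqrt(alpha_bar[t]) * image + np.sqrt(1.0 - alpha_bar[t]) * eps


def restore(low_quality, t_start, alpha_bar, denoise_step, rng):
    """Noise the degraded input to step t_start, then run reverse steps down to 0.

    denoise_step(y, t) is a hypothetical placeholder for one reverse step of
    a diffusion model whose generative space has already been constrained.
    """
    y = add_noise(low_quality, t_start, alpha_bar, rng)
    for t in range(t_start, -1, -1):
        y = denoise_step(y, t)
    return y
```

In this sketch, the difference between a clean image and its degraded version is scaled by `sqrt(alpha_bar[t])` after noising, so a larger `t_start` hides more of the degradation but also discards more of the input. That trade-off is one way to read the abstract's observation that the generative space needs to be constrained for the result to stay faithful to the input.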

Paper Structure (33 sections, 10 equations, 18 figures, 3 tables)

Figures (18)

  • Figure 1: We harness the generative capacity of a diffusion model for image restoration. By constraining the generative space with a generative or personal album, we can directly use a pre-trained diffusion model to produce a high-quality and realistic image that is also faithful to the input identity. Without any assumption on the degradation type, we are able to generalize to real-world images that exhibit complicated degradation. We compare our restoration result with CodeFormer [zhou2022towards], a state-of-the-art baseline. Our method generalizes better to different types of degradation while more faithfully preserving the input identity. Images are best viewed zoomed in on a big screen.
  • Figure 2: Left: Image projection. When sufficient Gaussian noise is added to the low- and high-quality images, we can bring them to the same distribution. The low-quality image can thus be denoised with a pre-trained diffusion model. Right: With and without space constraining. A regular diffusion step lands $y_t$ in an arbitrary position in the generative space; with space constraining, the path of generation becomes more constrained towards the space defined by the anchor images.
  • Figure 3: An illustration of our finetuning and inference stage. The core of our method is to constrain the generative space by fine-tuning a pre-trained diffusion model with either a generative album or a personal album. The generative album is generated from the input low-quality image with skip guidance to loosely follow the characteristics of the input. Once the generative space is constrained, at inference time, we can simply add noise to the input low-quality image and pass it through the diffusion model to do restoration.
  • Figure 4: Qualitative comparison with baselines on Wider-Test. With the strong generative capacity of the diffusion model, our method performs well on severely degraded images. We are able to produce high-quality and realistic images while prior works suffer from unrealistic artifacts.
  • Figure 5: Comparison with previous methods on Deblur-Test. Previous methods do not include motion blur as part of the degradation simulation for training, and thus fail to restore the images. In contrast, our method does not make assumptions on the degradation types and generalizes more robustly.
  • ...and 13 more figures