Blind Inverse Problem Solving Made Easy by Text-to-Image Latent Diffusion

Michail Dontas, Yutong He, Naoki Murata, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov

TL;DR

The paper tackles blind inverse image restoration where the degradation operator is unknown. It introduces LADiBI, a training-free framework that leverages large pre-trained text-to-image diffusion models and classifier-free guidance to encode priors via prompts, coupled with a diffusion posterior sampling scheme that jointly estimates the image and operator parameters. A novel operator initialization method and iterative refinement are used to handle both linear and nonlinear degradations without task-specific retraining. Empirical results on linear deblurring and nonlinear JPEG decompression across multiple image distributions demonstrate competitive or superior performance and highlight LADiBI’s flexibility and practical potential, albeit with longer inference times. Overall, the work significantly broadens the applicability of diffusion-based inverse problem solving to truly blind, diverse restoration tasks with minimal assumptions.

Abstract

This paper considers blind inverse image restoration, the task of predicting a target image from a degraded source when the degradation (i.e., the forward operator) is unknown. Existing solutions typically rely on restrictive assumptions, such as operator linearity, curated training data, or narrow image distributions, that limit their practicality. We introduce LADiBI, a training-free method leveraging large-scale text-to-image diffusion to solve diverse blind inverse problems with minimal assumptions. Within a Bayesian framework, LADiBI uses text prompts to jointly encode priors for both target images and operators, unlocking unprecedented flexibility compared to existing methods. Additionally, we propose a novel diffusion posterior sampling algorithm that combines strategic operator initialization with iterative refinement of image and operator parameters, eliminating the need for highly constrained operator forms. Experiments show that LADiBI effectively handles both linear and challenging nonlinear image restoration problems across various image distributions, all without task-specific assumptions or retraining.
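
To make the idea of iteratively refining image and operator parameters concrete, here is a minimal toy sketch in NumPy. It is not the LADiBI algorithm: it uses no diffusion model, no text prompt, and no latent space. The 1-D Gaussian blur, the single parameter `sigma`, the step sizes, and the function names are all assumptions chosen for illustration. It only shows the alternating structure, where an image update and an operator update take turns reducing a measurement-consistency loss.

```python
import numpy as np

def blur(x, sigma):
    # Hypothetical forward operator: 1-D Gaussian blur with one parameter, sigma.
    t = np.arange(-4, 5)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    k /= k.sum()
    return np.convolve(x, k, mode="same")

def data_loss(x, sigma, y):
    # Measurement-consistency term 0.5 * ||blur(x, sigma) - y||^2.
    r = blur(x, sigma) - y
    return 0.5 * float(r @ r)

def alternating_refinement(y, sigma0=1.0, steps=100, lr_x=0.5, lr_s=0.1, eps=1e-3):
    # Start from the degraded signal and an initial guess for the operator.
    x, sigma = y.copy(), sigma0
    losses = [data_loss(x, sigma, y)]
    for _ in range(steps):
        # Image step: the kernel is symmetric, so the adjoint of blur is blur itself.
        x = x - lr_x * blur(blur(x, sigma) - y, sigma)
        # Operator step: central finite-difference gradient w.r.t. sigma,
        # accepted only if it does not increase the loss.
        g = (data_loss(x, sigma + eps, y) - data_loss(x, sigma - eps, y)) / (2 * eps)
        cand = float(np.clip(sigma - lr_s * g, 0.5, 5.0))
        if data_loss(x, cand, y) <= data_loss(x, sigma, y):
            sigma = cand
        losses.append(data_loss(x, sigma, y))
    return x, sigma, losses

# Usage on synthetic data (a box signal blurred with an "unknown" sigma of 2.0).
rng = np.random.default_rng(0)
x_true = np.zeros(64)
x_true[20:40] = 1.0
y = blur(x_true, 2.0) + 0.01 * rng.standard_normal(64)
x_hat, sigma_hat, losses = alternating_refinement(y)
print(f"loss: {losses[0]:.6f} -> {losses[-1]:.6f}, sigma estimate: {sigma_hat:.3f}")
```

Without a prior, this toy can drift toward the degenerate solution in which the estimate simply copies the measurement and the blur shrinks, so the recovered `sigma_hat` should not be read as accurate. That ambiguity is the reason blind methods need priors on both the image and the operator, which the abstract says LADiBI encodes through text prompts.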

Paper Structure

This paper contains 28 sections, 9 equations, 21 figures, 7 tables, 2 algorithms.

Figures (21)

  • Figure 1: Our proposed LADiBI is a training-free blind inverse problem solving algorithm for image restoration using large pre-trained text-to-image diffusion models. LADiBI is applicable to a wide variety of image distributions and operators, with minimal modeling assumptions imposed.
  • Figure 2: A schematic overview of LADiBI (Algorithm \ref{alg:main_alg}).
  • Figure 3: Qualitative results on blind linear deblurring tasks. From top to bottom we showcase examples from motion deblur on FFHQ, Gaussian deblur on FFHQ, motion deblur on AFHQ, and Gaussian deblur on AFHQ respectively.
  • Figure 4: Qualitative results on the blind JPEG decompression task.
  • Figure 4: Ablation study on JPEG decompression.
  • ...and 16 more figures