SILO: Solving Inverse Problems with Latent Operators

Ron Raphaeli, Sean Man, Michael Elad

TL;DR

SILO proposes solving inverse problems with latent diffusion models by learning a latent degradation operator that emulates image-space degradations entirely in the latent space, removing the need to backpropagate through the autoencoder during diffusion. By encoding the measurement and guiding diffusion in latent space with $H_\theta$, SILO achieves faster sampling and improved perceptual fidelity compared to prior latent-diffusion methods. The paper demonstrates state-of-the-art performance across Gaussian blur, super-resolution, inpainting, and JPEG restoration on FFHQ and COCO, with robust handling of measurement noise and informative text conditioning. This latent-domain approach significantly narrows the gap between data-consistency and latent priors, enabling scalable, high-quality inverse problem solving with latent diffusion models.
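
The relationship summarized above can be written compactly. The block below is a sketch in assumed notation, not the paper's own equations: $H$ (known image-space degradation), $n$ (measurement noise), $\mathcal{E}$ (frozen encoder), $z$ (latent), and $\ell_{\text{data}}$ (latent data-fidelity term) are names introduced here for illustration; only $x$, $y$, and $H_\theta$ appear in the source.

```latex
% Assumed notation (not taken from the paper): H, n, \mathcal{E}, z, \ell_{\text{data}}.
\begin{align}
  y &= H(x) + n
    && \text{(measurement in image space)} \\
  H_\theta\big(\mathcal{E}(x)\big) &\approx \mathcal{E}\big(H(x)\big)
    && \text{(latent operator emulates the degradation)} \\
  \ell_{\text{data}}(z) &= \big\| \mathcal{E}(y) - H_\theta(z) \big\|_2^2
    && \text{(data fidelity evaluated on latents only)}
\end{align}
```

Under these assumptions, the gradient of $\ell_{\text{data}}$ with respect to $z$ involves only $H_\theta$, which is consistent with the claim that guidance needs no backpropagation through the decoder. The exact guidance rule used during sampling is not given in this summary and is not reproduced here.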

Abstract

Consistent improvement of image priors over the years has led to the development of better inverse problem solvers. Diffusion models are the newcomers to this arena, posing the strongest known prior to date. Recently, such models operating in a latent space have become increasingly predominant due to their efficiency. In recent works, these models have been applied to solve inverse problems. Working in the latent space typically requires multiple applications of an Autoencoder during the restoration process, which leads to both computational and restoration quality challenges. In this work, we propose a new approach for handling inverse problems with latent diffusion models, where a learned degradation function operates within the latent space, emulating a known image space degradation. Usage of the learned operator reduces the dependency on the Autoencoder to only the initial and final steps of the restoration process, facilitating faster sampling and superior restoration quality. We demonstrate the effectiveness of our method on a variety of image restoration tasks and datasets, achieving significant improvements over prior art.
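
To make the idea of a learned latent degradation concrete, here is a minimal NumPy sketch under loudly stated assumptions. The encoder is a toy frozen linear map (projection onto low Fourier modes), the degradation is a circular blur on a 1-D signal, and the objective $\|H_\theta(\mathcal{E}(x)) - \mathcal{E}(H(x))\|^2$ is an assumed form of the training loss, not one confirmed by this summary. The function names `circular_blur`, `fourier_encoder`, and `train_latent_operator` are hypothetical. The setup is chosen so that an exact latent operator exists; with a real nonlinear autoencoder the emulation would only be approximate.

```python
import numpy as np


def circular_blur(n, width=3):
    """Known image-space degradation H: circular moving average over a 1-D 'image'."""
    H = np.zeros((n, n))
    for i in range(n):
        for k in range(-(width // 2), width // 2 + 1):
            H[i, (i + k) % n] += 1.0 / width
    return H


def fourier_encoder(n, n_freq=2):
    """Toy frozen encoder E (assumption): projection onto the lowest real Fourier modes."""
    j = np.arange(n)
    rows = [np.ones(n) / np.sqrt(n)]
    for k in range(1, n_freq + 1):
        rows.append(np.sqrt(2.0 / n) * np.cos(2 * np.pi * k * j / n))
        rows.append(np.sqrt(2.0 / n) * np.sin(2 * np.pi * k * j / n))
    return np.stack(rows)  # shape (2 * n_freq + 1, n), orthonormal rows


def train_latent_operator(E, H, n_steps=500, batch=64, lr=0.05, seed=0):
    """Fit W (a toy linear H_theta) so that W @ E(x) matches E(H(x))."""
    rng = np.random.default_rng(seed)
    d_lat, d_img = E.shape
    W = np.zeros((d_lat, d_lat))  # parameters of the toy latent operator
    losses = []
    for _ in range(n_steps):
        x = rng.standard_normal((batch, d_img))  # stand-in for clean training images
        z = x @ E.T                              # E(x): latents of clean images
        target = (x @ H.T) @ E.T                 # E(H(x)): latents of degraded images
        residual = z @ W.T - target              # H_theta(E(x)) - E(H(x))
        losses.append(float(np.mean(np.sum(residual ** 2, axis=1))))
        # Gradient with respect to W only; encoder outputs are treated as constants,
        # so nothing is differentiated through the image space.
        W -= lr * (2.0 / batch) * residual.T @ z
    return W, losses


E = fourier_encoder(16)
H = circular_blur(16)
W, losses = train_latent_operator(E, H)
print("first / last loss:", losses[0], losses[-1])
print("diagonal of learned W:", np.round(np.diag(W), 4))
```

In this toy, the blur acts on each Fourier mode as a scalar gain, so the learned `W` approaches a diagonal matrix of those gains and the loss approaches zero. That exactness is a property of the contrived linear setup, not a claim about SILO's operator on real autoencoder latents.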

Paper Structure (34 sections, 25 equations, 17 figures, 15 tables, 1 algorithm)

Figures (17)

  • Figure 1: Measurements and their corresponding reconstructions using our proposed latent inverse solver, SILO (Algorithm 1).
  • Figure 2: Computational schemes of prior work and SILO. (a) Prior work enforces consistency with the measurement in pixel space, which requires differentiating through the decoder. (b) SILO keeps all computations in the latent space. This allows faster reconstructions with better perceptual quality than prior work, as seen in the FFHQ comparison figure and the results tables (super-resolution $\times 4$ and Gaussian blur; super-resolution $\times 8$ and inpainting).
  • Figure 3: Training scheme of the latent operator, $H_\theta$. In training, gradients flow from $\mathcal{L}$ to update the parameters of $H_\theta$. Note that no gradients pass through the pixel space. $H_\theta$ learns to mimic the effect of the degradation operator in the latent space, allowing us to use SILO to solve inverse problems using LDMs.
  • Figure 4: Comparison of SILO to other methods. From left to right: the clean image $x$, the measurement $y$, and the reconstructions using SILO (with RV and SD), ReSample, and PSLD. Each row contains the image and a zoom-in to show the differences better. From top to bottom, the degradations are inpainting, super-resolution ($\times 8$), and Gaussian blur. The settings are detailed in the Degradations section.
  • Figure 5: Restorations of masked images from the COCO dataset.
  • ...and 12 more figures