Particle-Filtering-based Latent Diffusion for Inverse Problems

Amir Nazemi, Mohammad Hadi Sepanj, Nicholas Pellegrino, Chris Czarnecki, Paul Fieguth

TL;DR

The paper tackles robustness and variability in diffusion-model–based inverse problems by introducing PFLD, a particle-filtering framework that runs multiple latent-space samples as particles during the early reverse diffusion steps. Each particle is guided toward data-consistency via a Cauchy-based likelihood and selectively pruned, enabling broader exploration of the solution space with fewer overall diffusion runs. Empirical results on FFHQ-1K and ImageNet-1K demonstrate that PFLD-10 improves over PSLD on super-resolution and inpainting, with competitive performance in Gaussian deblurring, while offering substantial runtime advantages over repeated PSLD inferences. This approach provides a general, robust mechanism to mitigate initialization sensitivity in diffusion-based inverse problem solvers and sets up a path toward more sophisticated particle-filtering strategies in latent diffusion.
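The weighting-and-pruning idea described above can be illustrated with a minimal NumPy sketch. This is a toy under stated assumptions, not the paper's implementation: the linear operator `A`, the scale `gamma`, the top-k pruning rule, and the function names are all hypothetical, and the particles here are plain vectors rather than latents passed through a decoder.

```python
import numpy as np

def cauchy_log_likelihood(y, y_pred, gamma=1.0):
    """Unnormalized Cauchy log-likelihood of the residual y - y_pred.

    gamma is an assumed scale parameter; the heavy tails penalize large
    residuals less severely than a Gaussian would.
    """
    r = (y - y_pred) / gamma
    return -np.sum(np.log1p(r ** 2), axis=-1)

def prune_particles(particles, y, forward, keep, gamma=1.0):
    """Keep the `keep` particles whose predicted measurements best match y.

    Top-k selection is an assumption made for this sketch; the source only
    says that particles are "selectively pruned".
    """
    y_pred = np.stack([forward(p) for p in particles])
    log_w = cauchy_log_likelihood(y, y_pred, gamma)
    kept = np.argsort(log_w)[::-1][:keep]  # indices sorted best first
    return particles[kept], log_w[kept]

# Toy linear inverse problem: 8 unknowns, 4 measurements.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))
x_true = rng.normal(size=8)
y = A @ x_true

# Ten particles, each perturbed from x_true by a different noise level.
noise_scale = np.linspace(0.01, 2.0, 10)[:, None]
particles = x_true + rng.normal(size=(10, 8)) * noise_scale

survivors, log_w = prune_particles(particles, y, lambda x: A @ x, keep=3)
print(survivors.shape, log_w)
```

In a full solver, a step like this would sit between reverse-diffusion updates, so that only the surviving particles are propagated further.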

Abstract

Current strategies for solving image-based inverse problems apply latent diffusion models to perform posterior sampling. However, almost all approaches make no explicit attempt to explore the solution space, instead drawing only a single sample from a Gaussian distribution from which to generate their solution. In this paper, we introduce a particle-filtering-based framework for a nonlinear exploration of the solution space in the initial stages of reverse SDE methods. Our proposed particle-filtering-based latent diffusion (PFLD) method, together with its problem formulation and framework, can be applied to any diffusion-based solution for linear or nonlinear inverse problems. Our experimental results show that PFLD outperforms the SoTA solver PSLD on the FFHQ-1K and ImageNet-1K datasets on the inverse problem tasks of super-resolution, Gaussian deblurring, and inpainting.

Paper Structure (26 sections, 17 equations, 12 figures, 3 tables, 1 algorithm)


Figures (12)

  • Figure 1: In the proposed PFLD method, multiple random samples, $z_l$, in the latent space of a diffusion model act as particles. Starting from an outer manifold, $\mathcal{M}_T$, indistinguishable from the space of random noise, these particles progress through steps of reverse diffusion (via $s_{\theta^*}$) towards $\mathcal{M}_0$, a manifold corresponding to noise-free images. The line $\ell$ corresponds to the set of plausible solutions, all of which map to the measurement, $y$, through the forward process $\mathcal{A}$. Particles are also guided towards $\ell$ through a gradient term, $\nabla_{x^t}||y-\mathcal{A}(\mathcal{D}(\hat{z}^0_l))||_2^2$, based on intermediate estimates, $\hat{z}^0_l$, of where the particle will be at the end of reverse diffusion. Particles farther from $\ell$, destined to be far from the ground truth solution, $z^*$, at the end of reverse diffusion, are pruned (marked with a red ✗). A single final particle reaches $\mathcal{M}_0$, producing the final estimated solution, $\hat{z}^0$.
  • Figure 2: Ten runs of PSLD (Rout et al., 2023) started from different initial points, $z^T_l$, drawn with different random seeds, produce different estimates, $\hat{x}^0_l$, of the ground truth image $x^*$. The forward model, $\mathcal{A}$, masks the center of $x^*$. The estimator only has access to the measurement $y$ to solve the inverse problem. The image used in this figure is from the FFHQ dataset (Karras et al., 2019).
  • Figure 3: Qualitative results. For the super-resolution task, PFLD reconstructs the face with more detail (freckles, wrinkles), while PSLD struggles to accurately reconstruct the lock of hair falling onto the bandage. For the inpainting task, PSLD struggles to accurately reconstruct the eyes and the bridge of the nose, while PFLD achieves a believable result. For the Gaussian deblurring task, PFLD generates a slightly sharper image with fewer artifacts. The images in the first and third columns of this figure are from ImageNet-1K, and the inpainting example is from FFHQ-1K ($512\times512$).
  • Figure 4: The effect of varying the number of particles, $\{1,5,10,15,20,25,30\}$, on different metrics for the first $10$ images of the FFHQ-1K ($512\times512$) dataset on two tasks: Super Resolution ($\times 8$) and Gaussian Deblurring. We repeated each experiment $10$ times. The shaded regions show the standard deviation of each experiment, indicating the degree of variance in the results.
  • Figure 5: The effect of varying the number of initial particles on the running time of the PFLD algorithm. Results are shown for the SR ($\times 8$) inverse problem for a single image from the FFHQ-1K $512\times512$ dataset. The time required to simply run PSLD an equivalent number of times is shown for comparison, making clear that PFLD is far more efficient. As illustrated, starting with many initial particles and then pruning allows the proposed PFLD method to broadly explore the solution space while remaining efficient. The runtime for repeated PSLD is linear in the number of repetitions, $N$.
  • ...and 7 more figures
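
The runtime argument in the Figure 5 caption can be made concrete by counting denoiser evaluations. The sketch below is a back-of-the-envelope illustration only: the halving schedule, the pruning interval, the step count, and both function names are hypothetical, since the pruning schedule actually used by PFLD is not given in this excerpt.

```python
def repeated_run_cost(n_runs, n_steps):
    """Denoiser evaluations for n_runs independent full reverse-diffusion runs."""
    return n_runs * n_steps

def pruned_run_cost(n_particles, n_steps, prune_every):
    """Denoiser evaluations when particles are pruned during reverse diffusion.

    Assumed schedule (hypothetical): every `prune_every` steps, half of the
    active particles are discarded until a single particle remains.
    """
    active, cost = n_particles, 0
    for step in range(n_steps):
        cost += active  # one denoiser call per active particle
        if (step + 1) % prune_every == 0 and active > 1:
            active = max(1, active // 2)
    return cost

baseline = repeated_run_cost(10, 1000)                  # 10 * 1000 = 10000
pruned = pruned_run_cost(10, 1000, prune_every=50)      # 500 + 250 + 100 + 850 = 1700
print(baseline, pruned)
```

Under these assumed numbers, the repeated-run cost grows linearly with the number of runs, while the pruned cost is dominated by the long single-particle tail, which is the qualitative behaviour the caption describes.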