PnP-Flow: Plug-and-Play Image Restoration with Flow Matching

Ségolène Martin, Anne Gagneux, Paul Hagemann, Gabriele Steidl

TL;DR

PnP Flow Matching tackles ill-posed image restoration by uniting Plug-and-Play denoising with Flow Matching priors. It introduces a time-dependent denoiser D_t derived from a pre-trained velocity field v^θ and couples it with a reprojection/interpolation step in a Forward-Backward-style scheme to keep iterates on the learned flow path. The method is memory-efficient, avoids backpropagating through ODEs, and uses a time-varying learning rate to balance data fidelity and denoising. Empirical results on denoising, deblurring, super-resolution, and inpainting on CelebA and AFHQ-Cat show PSNR/SSIM competitive with or superior to state-of-the-art Flow Matching-based and diffusion-based PnP methods, with robust initialization behavior. The work broadens Flow Matching to practical restoration tasks and supports non-Gaussian latent distributions and straight-line flow models, suggesting avenues for future posterior-sampling applications.

Abstract

In this paper, we introduce Plug-and-Play (PnP) Flow Matching, an algorithm for solving imaging inverse problems. PnP methods leverage the strength of pre-trained denoisers, often deep neural networks, by integrating them in optimization schemes. While they achieve state-of-the-art performance on various inverse problems in imaging, PnP approaches face inherent limitations on more generative tasks like inpainting. On the other hand, generative models such as Flow Matching have pushed the boundaries of image sampling yet lack a clear method for efficient use in image restoration. We propose to combine the PnP framework with Flow Matching (FM) by defining a time-dependent denoiser using a pre-trained FM model. Our algorithm alternates between gradient descent steps on the data-fidelity term, reprojections onto the learned FM path, and denoising. Notably, our method is computationally efficient and memory-friendly, as it avoids backpropagation through ODEs and trace computations. We evaluate its performance on denoising, super-resolution, deblurring, and inpainting tasks, demonstrating superior results compared to existing PnP algorithms and state-of-the-art Flow Matching-based methods.
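
The alternation described in the abstract (gradient step on the data-fidelity term, reprojection onto the flow path, denoising) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation. Several details are assumptions not stated in this summary: the denoiser form D_t(x) = x + (1 - t) v_t(x), the step-size schedule (1 - t)^alpha, a standard Gaussian latent distribution, initialization at the observation, and the names `pnp_flow`, `denoiser`, and `toy_velocity`. The toy velocity field is the closed-form straight-line flow between N(0, Id) and N(M, S^2 Id), used here in place of a trained network.

```python
import numpy as np

M, S = 7.0, 0.5  # toy target N(M, S^2 Id); values borrowed from the Figure 1 caption


def toy_velocity(x, t):
    """Closed-form straight-line velocity from N(0, Id) to N(M, S^2 Id) (toy stand-in)."""
    x0 = (x - t * M) / (1.0 - t + t * S)  # invert x_t = (1 - t + t*S) * x0 + t*M
    return M + (S - 1.0) * x0             # equals x1 - x0 along the straight line


def denoiser(x, t, velocity):
    """Assumed time-dependent denoiser: D_t(x) = x + (1 - t) * v_t(x)."""
    return x + (1.0 - t) * velocity(x, t)


def pnp_flow(y, grad_datafit, velocity, n_steps=100, alpha=1.0, seed=0):
    """Alternate gradient step, reprojection onto the path, and denoising."""
    rng = np.random.default_rng(seed)
    x = np.array(y, dtype=float)               # assumed initialization
    for n in range(n_steps):
        t = n / n_steps
        gamma = (1.0 - t) ** alpha             # assumed time-varying step size
        z = x - gamma * grad_datafit(x)        # gradient step on the data-fidelity term
        eps = rng.standard_normal(z.shape)     # latent sample, assumed N(0, Id)
        z_tilde = (1.0 - t) * eps + t * z      # interpolation onto the path at time t
        x = denoiser(z_tilde, t, velocity)     # time-dependent denoising
    return x


# Toy 2D denoising problem: y = x + noise with noise level sigma.
sigma = 1.5
y = np.array([5.0, 9.0])
x_hat = pnp_flow(y, lambda x: (x - y) / sigma**2, toy_velocity)
```

Each iteration needs a single evaluation of the velocity field and no ODE solve, which is consistent with the abstract's claim that the method avoids backpropagation through ODEs.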

Paper Structure

This paper contains 46 sections, 2 theorems, 21 equations, 17 figures, 11 tables, 3 algorithms.

Key Result

Proposition 1

Assume $v:=v^\theta$ is continuous and assume that, given $v$, the flow ODE has a unique solution $f$. Then the denoising loss $\mathbb{E}_{(X_0,X_1) \sim \pi}[\Vert D_t(X_t) -X_1 \Vert^2]$ is equal to 0 for all $t \in [0,1]$ if and only if the couple $(f, v)$ is a straight-lin
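
To see why a straight-line flow makes the denoising loss vanish, a short check helps. It relies on two assumptions that are not spelled out in this summary: that the denoiser has the form $D_t(x) := x + (1-t)\,v_t(x)$, and that along a straight-line path $X_t = (1-t)X_0 + tX_1$ the velocity satisfies $v_t(X_t) = X_1 - X_0$. Under these assumptions:

$$
D_t(X_t) = (1-t)X_0 + tX_1 + (1-t)(X_1 - X_0) = X_1 ,
$$

so $\Vert D_t(X_t) - X_1 \Vert^2 = 0$ for every $t \in [0,1]$. This illustrates only the "if" direction; the converse is part of the proposition and is not derived here.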

Figures (17)

  • Figure 1: Our method on a 2D denoising task ($\sigma=1.5$) with Gaussian distributions. An OT Flow Matching model is trained to sample from $P_1 = \mathcal{N}(m, s^2 \mathrm{Id})$, with $m=7$ and $s=0.5$. At each time step, it performs a standard gradient step on the datafit, followed by a projection onto flow trajectories at time $t$, and finally applies the time-dependent denoiser $D_t$.
  • Figure 2: Illustration of the interpolation step.
  • Figure 3: Comparison of image restoration methods on CelebA: denoising (1st row), Gaussian deblurring (2nd row), super-resolution (3rd row), random pixel inpainting (4th row), box-inpainting (5th row). N/A means "method not applicable". Zoom in to see that PnP-Flow performs consistently well across all tasks compared to the baselines.
  • Figure 4: Comparison of image restoration methods on AFHQ-Cat: denoising (1st row), Gaussian deblurring (2nd row), super-resolution (3rd row), random pixel inpainting (4th row), box-inpainting (5th row). N/A means "method not applicable".
  • Figure 5: Dirichlet Flow Matching experiment on Simplex-MNIST, for denoising (1st row), super-resolution (2nd row), box-inpainting (3rd row). We measure the reconstruction error as the mean L2 distance (called MSE) between ground truth and reconstruction averaged over the 16 images.
  • ...and 12 more figures

Theorems & Definitions (6)

  • Proposition 1
  • Remark 2: Flow Matching versus diffusion models
  • Remark 3: Averaging in the denoising step
  • Proposition 4
  • proof
  • proof