Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

Guy Ohayon, Tomer Michaeli, Michael Elad

TL;DR

This work targets the estimator that attains the minimum MSE under a perfect perceptual index in photo-realistic image restoration, and introduces Posterior-Mean Rectified Flow (PMRF) to approximate it. PMRF combines a posterior-mean predictor $\hat{X}^* = \mathbb{E}[X|Y]$ with a rectified flow that transports this prediction to the ground-truth distribution $p_X$, thereby approximating the optimal estimator $\hat{X}_0$ directly in pixel space. The method is trained in two stages, comes with theoretical guarantees when $\sigma_s=0$, and is evaluated on denoising, super-resolution, inpainting, colorization, and blind face restoration, where it outperforms posterior samplers, flows that start from $Y$, and latent-OT approaches. Results on CelebA-Test and real-world datasets show improved distortion metrics (PSNR, SSIM) while perceptual quality (FID, KID, NIQE) is preserved. Overall, PMRF offers a principled framework for achieving low distortion without sacrificing perceptual naturalness across diverse restoration tasks.

Abstract

Photo-realistic image restoration algorithms are typically evaluated by distortion measures (e.g., PSNR, SSIM) and by perceptual quality measures (e.g., FID, NIQE), where the desire is to attain the lowest possible distortion without compromising on perceptual quality. To achieve this goal, current methods commonly attempt to sample from the posterior distribution, or to optimize a weighted sum of a distortion loss (e.g., MSE) and a perceptual quality loss (e.g., GAN). Unlike previous works, this paper is concerned specifically with the optimal estimator that minimizes the MSE under a constraint of perfect perceptual index, namely where the distribution of the reconstructed images is equal to that of the ground-truth ones. A recent theoretical result shows that such an estimator can be constructed by optimally transporting the posterior mean prediction (MMSE estimate) to the distribution of the ground-truth images. Inspired by this result, we introduce Posterior-Mean Rectified Flow (PMRF), a simple yet highly effective algorithm that approximates this optimal estimator. In particular, PMRF first predicts the posterior mean, and then transports the result to a high-quality image using a rectified flow model that approximates the desired optimal transport map. We investigate the theoretical utility of PMRF and demonstrate that it consistently outperforms previous methods on a variety of image restoration tasks.
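The two-step procedure described above (predict the posterior mean, then transport it with a flow) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `pmrf_sample`, the Euler integrator, and the toy scalar Gaussian model ($X \sim \mathcal{N}(0,1)$, $Y = X + N$) are assumptions chosen only so that the posterior mean and the flow's velocity field have closed forms. In a real setting both would be learned neural networks.

```python
import numpy as np

def pmrf_sample(y, posterior_mean, velocity, num_steps=1000, sigma_s=0.0, rng=None):
    """Hypothetical PMRF-style sampler: posterior mean followed by an Euler-integrated flow."""
    rng = np.random.default_rng() if rng is None else rng
    # Step 1: posterior-mean (MMSE) prediction, optionally perturbed by noise of level sigma_s.
    z = posterior_mean(y)
    z = z + sigma_s * rng.standard_normal(z.shape)
    # Step 2: integrate dz/dt = velocity(z, t) from t = 0 to t = 1 with forward Euler.
    dt = 1.0 / num_steps
    for k in range(num_steps):
        z = z + dt * velocity(z, k * dt)
    return z

# Toy model (assumption): X ~ N(0, 1), Y = X + N with N ~ N(0, noise_var).
noise_var = 1.0
a = 1.0 / (1.0 + noise_var)  # E[X | Y] = a * Y, and Var(E[X | Y]) = a

def toy_posterior_mean(y):
    return a * y

def toy_velocity(z, t):
    # Closed form of E[X - Z_0 | Z_t = z] for Z_t = t * X + (1 - t) * Z_0 in this toy model.
    return t * (1.0 - a) * z / (t**2 + a * (1.0 - t**2))

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = x + np.sqrt(noise_var) * rng.standard_normal(x.shape)
x_hat = pmrf_sample(y, toy_posterior_mean, toy_velocity)
```

In this toy case the flow reduces to rescaling the posterior mean by $1/\sqrt{a}$, so `x_hat` is close to `y / np.sqrt(1 + noise_var)`, which has the same distribution as `x`. The closed forms are specific to the Gaussian assumption and do not carry over to images.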

Paper Structure (54 sections, 3 theorems, 37 equations, 20 figures, 12 tables)

Key Result

Proposition 1

Suppose that $\sigma_{s}=0$, and assume that the solution of the ODE in Eq. (\ref{eq:pmrf-gen}) exists and is unique. Then,

Figures (20)

  • Figure 1: Illustration of the distortion-perception tradeoff, where distortion is measured by MSE. Many photo-realistic image restoration methods aim for posterior sampling. Theoretically, this approach achieves a perfect perceptual index ($p_{\hat{X}}=p_{X}$) but its MSE is twice the MMSE. In contrast, we aim for the estimator $\hat{X}_{0}$ that minimizes the MSE under a perfect perceptual index constraint (Eq. (\ref{eq:dmax})), which typically achieves a smaller MSE than posterior sampling.
  • Figure 2: Visual results of PMRF (our method) on the CelebA-Test blind face image restoration data set. Our algorithm produces sharp and visually appealing details while maintaining incredibly low distortion according to a variety of measures simultaneously. See Table \ref{table:celeba-test-blind}.
  • Figure 3: Posterior-Mean Rectified Flow (PMRF)
  • Figure 4: Real-world face image restoration. Top: Qualitative results on inputs from the WIDER-Test data set. Bottom: Comparison on the "distortion"-perception plane (IndRMSE vs. FID), where IndRMSE indicates the RMSE of each method (the true distortion cannot be computed as there is no access to the ground-truth images). Our algorithm outperforms all other methods in IndRMSE, while achieving on-par perceptual quality compared to the state-of-the-art.
  • Figure 5: A controlled experiment comparing PMRF (our method) with several baseline methods, where the models are trained with the same architecture, hyper-parameters, etc. (see Section \ref{section:ablation}). Top: Qualitative comparison of PMRF and the baseline methods on several tasks. Bottom: Quantitative comparison on the distortion-perception plane. DOT is not a flow model, but rather another approach that attempts to approximate $\hat{X}_{0}$ (like PMRF). These experiments demonstrate that PMRF is either superior to or on par with previous frameworks (i.e., posterior sampling or flowing from $Y$) on a variety of image restoration tasks. See Section \ref{section:ablation} for more details.
  • ...and 15 more figures
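
The statement in the Figure 1 caption that posterior sampling has an MSE of twice the MMSE follows from a short, standard argument. As a sketch, assume $\hat{X}$ is drawn from $p_{X|Y}(\cdot|Y)$ independently of $X$ given $Y$, and write $\hat{X}^* = \mathbb{E}[X|Y]$ for the posterior mean. Then

$$
\mathbb{E}\|X-\hat{X}\|^2
= \mathbb{E}\|X-\hat{X}^*\|^2
+ \mathbb{E}\|\hat{X}-\hat{X}^*\|^2
- 2\,\mathbb{E}\big[(X-\hat{X}^*)^\top(\hat{X}-\hat{X}^*)\big].
$$

Given $Y$, the vectors $X-\hat{X}^*$ and $\hat{X}-\hat{X}^*$ are independent and both have zero conditional mean, so the cross term vanishes. Since $\hat{X}$ and $X$ share the same conditional distribution given $Y$, the first two terms are equal, and each is the MMSE:

$$
\mathbb{E}\|X-\hat{X}\|^2 = 2\,\mathbb{E}\|X-\hat{X}^*\|^2 = 2\,\mathrm{MMSE}.
$$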

Theorems & Definitions (6)

  • Proposition 1
  • Example 1
  • Proposition 2
  • proof
  • Proposition 2
  • proof