Restora-Flow: Mask-Guided Image Restoration with Flow Matching

Arnela Hadzic, Franz Thaler, Lea Bogensperger, Simon Johannes Joham, Martin Urschler

TL;DR

Restora-Flow tackles mask-based image restoration by combining flow matching with a mask-guided sampling strategy and a trajectory correction mechanism, all in a training-free framework. The method casts restoration as a MAP problem and uses data-consistency-driven fusion together with trajectory corrections to keep generated content aligned with the observed degraded regions. Empirically, it achieves superior perceptual quality (LPIPS) and competitive, often faster, processing times across denoising, inpainting, and super-resolution on natural and medical datasets, outperforming both diffusion-based and earlier flow matching-based reference methods. This makes it practical for fast, high-quality restoration in diverse domains, including medical imaging, where rapid and reliable reconstruction is critical.

Abstract

Flow matching has emerged as a promising generative approach that addresses the lengthy sampling times associated with state-of-the-art diffusion models and enables a more flexible trajectory design, while maintaining high-quality image generation. This capability makes it suitable as a generative prior for image restoration tasks. Although current methods leveraging flow models have shown promising results in restoration, some still suffer from long processing times or produce over-smoothed results. To address these challenges, we introduce Restora-Flow, a training-free method that guides flow matching sampling by a degradation mask and incorporates a trajectory correction mechanism to enforce consistency with degraded inputs. We evaluate our approach on both natural and medical datasets across several image restoration tasks involving a mask-based degradation, i.e., inpainting, super-resolution and denoising. We show superior perceptual quality and processing time compared to diffusion and flow matching-based reference methods.
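
The idea of guiding flow matching sampling with a degradation mask and then correcting the trajectory can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: the time convention (t = 0 is noise, t = 1 is data), the linear interpolation path, the exact form of the fusion and correction steps, and all names (`toy_velocity`, `mask_guided_sample`, `corrections`) are assumptions made for illustration, and the velocity function is a toy stand-in for a pretrained flow matching network.

```python
import numpy as np

def toy_velocity(x, t, target):
    # Hypothetical stand-in for a pretrained flow matching network.
    # For the linear path x_t = (1 - t) * x0 + t * x1 with a single known
    # target x1, the velocity pointing at the target is (x1 - x_t) / (1 - t).
    return (target - x) / max(1.0 - t, 1e-3)

def mask_guided_sample(y, mask, velocity, steps=16, corrections=1, seed=0):
    """Euler sampling from noise (t=0) to data (t=1), guided by a binary mask.

    y    : degraded observation, valid where mask == 1
    mask : 1 for observed pixels, 0 for pixels to be generated
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(y.shape)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        # Mask-guided fusion (assumed form): observed pixels are replaced by
        # the observation noised to the current time t.
        noise = rng.standard_normal(y.shape)
        x = mask * ((1.0 - t) * noise + t * y) + (1.0 - mask) * x
        # Plain Euler step along the flow.
        x_next = x + dt * velocity(x, t)
        t_next = t + dt
        for _ in range(corrections):
            # Trajectory correction (assumed form): extrapolate to a clean
            # estimate, restore the observed pixels, re-noise to time t_next.
            x1_hat = x_next + (1.0 - t_next) * velocity(x_next, t_next)
            x1_hat = mask * y + (1.0 - mask) * x1_hat
            noise = rng.standard_normal(y.shape)
            x_next = (1.0 - t_next) * noise + t_next * x1_hat
        x = x_next
    # Enforce exact consistency with the observation in the final output.
    return mask * y + (1.0 - mask) * x

# Usage: inpaint the right half of a small synthetic "image".
target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
mask = np.zeros((4, 4))
mask[:, :2] = 1.0
y = mask * target
restored = mask_guided_sample(y, mask, lambda x, t: toy_velocity(x, t, target))
```

In this sketch each correction step costs additional velocity evaluations, so the `corrections` count trades processing time against consistency with the degraded input; with a real network, both it and `steps` would need to be tuned empirically.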

Paper Structure

This paper contains 17 sections, 9 equations, 11 figures, 1 table, 2 algorithms.

Figures (11)

  • Figure 1: We propose Restora-Flow, a mask-guided image restoration method based on flow matching with an integrated trajectory correction mechanism. Restora-Flow offers excellent performance in both reconstruction quality and processing efficiency when compared to state-of-the-art methods for various tasks. This is demonstrated in the results showing LPIPS perceptual quality score vs. processing time for the AFHQ-Cat dataset (left). Exemplary qualitative reconstruction results of Restora-Flow are shown across four different tasks and evaluated on four datasets with distinct characteristics (right).
  • Figure 2: Restora-Flow samples with and without correction steps. Empirically, one correction step ($C=1$) offers the best trade-off between high reconstruction quality and fast processing.
  • Figure 3: Qualitative results on CelebA. Shown are the degraded image (col 1), original image (col 2), restored images using related work as indicated (cols 3-8) and restored image of Restora-Flow (col 9). Rows refer to denoising (row 1), box inpainting (row 2), super-resolution (row 3) and random inpainting (row 4). RePaint is not applicable to denoising (N/A). Differences are best seen in the PDF version.
  • Figure 4: Visual representation of quantitative results on CelebA. Restora-Flow is compared to related work methods (distinguished by marker shape) on four different tasks (distinguished by color). The plots show LPIPS $\downarrow$ (left), SSIM $\uparrow$ (center) and PSNR $\uparrow$ (right) on the y-axis, and processing time $\downarrow$ (all plots) on the x-axis. For better visualization and comparison, each plot is split into two parts with different x-axis scales.
  • Figure 5: Ablation of ODE steps (indicated by markers) and correction steps $C$ for $2\times$ super-resolution on CelebA comparing LPIPS $\downarrow$ (top), SSIM $\uparrow$ (middle) and PSNR $\uparrow$ (bottom) to processing time $\downarrow$. ODE steps increase from left to right and represent 4, 8, 16, 32, 64, 128 and 256, respectively. For better visualization, ODE steps 4 and 8 when using $C=0$ are omitted. The circle indicates the selected hyperparameters. Time is per image and displayed on a logarithmic scale.
  • ...and 6 more figures