Continuous Degradation Modeling via Latent Flow Matching for Real-World Super-Resolution

Hyeonjae Kim, Dongjin Kim, Eugene Jin, Tae Hyun Kim

TL;DR

This work tackles the discrepancy between real-world image degradations and synthetic SR training data by proposing DegFlow, a two-stage framework that learns a degradation manifold in latent space to synthesize authentic LR images from a single HR input. It combines a Residual Autoencoder (RAE) to map images into a compact latent representation with multi-scale HR skip connections, and a Latent Flow Matching (LFM) model that learns continuous degradation trajectories via a natural cubic spline and a velocity field, enabling LR generation at unseen degradation levels through ODE integration. A perceptual LPIPS loss and third-order Taylor extrapolation enable supervision at intermediate scales, yielding more realistic degradations and sharper SR results. Experiments on RealSR and RealArbiSR show that SR models trained on DegFlow-generated data achieve state-of-the-art performance for both fixed-scale and arbitrary-scale SR, demonstrating practical impact for building large-scale, realistic SR datasets from HR images alone.
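The trajectory idea in the summary above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the latent is reduced to a single hypothetical scalar coordinate, the anchor values and the even spacing of timesteps for HR, ×2, ×3, ×4 are assumptions, and the learned velocity network is replaced by the analytic derivative of a natural cubic spline. Integrating that velocity from $t=0$ with a generic RK4 solver should then land on the spline at an intermediate, "unseen" timestep.

```python
from bisect import bisect_right

def natural_spline_second_derivs(ts, ys):
    """Second derivatives M_i of a natural cubic spline (M_0 = M_n = 0)."""
    n = len(ts) - 1
    h = [ts[i + 1] - ts[i] for i in range(n)]
    M = [0.0] * (n + 1)
    if n < 2:
        return M
    # Tridiagonal system for the interior unknowns M_1..M_{n-1}.
    a = [h[i - 1] for i in range(1, n)]               # sub-diagonal
    b = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]  # diagonal
    c = [h[i] for i in range(1, n)]                   # super-diagonal
    d = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
         for i in range(1, n)]
    # Thomas algorithm: forward elimination, then back-substitution.
    for k in range(1, n - 1):
        w = a[k] / b[k - 1]
        b[k] -= w * c[k - 1]
        d[k] -= w * d[k - 1]
    x = [0.0] * (n - 1)
    x[-1] = d[-1] / b[-1]
    for k in range(n - 3, -1, -1):
        x[k] = (d[k] - c[k] * x[k + 1]) / b[k]
    M[1:n] = x
    return M

def spline_eval(ts, ys, M, t, deriv=False):
    """Value (or first derivative) of the spline at time t."""
    i = min(max(bisect_right(ts, t) - 1, 0), len(ts) - 2)
    h = ts[i + 1] - ts[i]
    A, B = ts[i + 1] - t, t - ts[i]
    p = ys[i] / h - M[i] * h / 6.0
    q = ys[i + 1] / h - M[i + 1] * h / 6.0
    if deriv:
        return -M[i] * A * A / (2 * h) + M[i + 1] * B * B / (2 * h) - p + q
    return M[i] * A ** 3 / (6 * h) + M[i + 1] * B ** 3 / (6 * h) + p * A + q * B

def integrate_rk4(v, z0, t0, t1, steps=60):
    """Integrate dz/dt = v(z, t) from t0 to t1 with classical RK4."""
    z, t = z0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        k1 = v(z, t)
        k2 = v(z + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = v(z + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = v(z + dt * k3, t + dt)
        z += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += dt
    return z

# Hypothetical anchors: one latent coordinate at HR, x2, x3, x4 (assumed values
# and assumed evenly spaced timesteps; the paper's mapping may differ).
ts = [0.0, 1 / 3, 2 / 3, 1.0]
ys = [0.0, 0.8, 1.1, 1.9]
M = natural_spline_second_derivs(ts, ys)

# Stand-in for the learned velocity field: the spline's time derivative.
velocity = lambda z, t: spline_eval(ts, ys, M, t, deriv=True)

# Start from the HR latent and integrate to an intermediate degradation level.
z_half = integrate_rk4(velocity, ys[0], 0.0, 0.5)
print(z_half, spline_eval(ts, ys, M, 0.5))
```

In a real model the velocity would depend on the latent state as well as on time, and the latent would be a high-dimensional tensor; the solver interface above is written for `v(z, t)` so that such a field could be substituted without changing the integration loop.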

Abstract

While deep learning-based super-resolution (SR) methods have shown impressive outcomes with synthetic degradation scenarios such as bicubic downsampling, they frequently struggle to perform well on real-world images that feature complex, nonlinear degradations like noise, blur, and compression artifacts. Recent efforts to address this issue have involved the painstaking compilation of real low-resolution (LR) and high-resolution (HR) image pairs, usually limited to several specific downscaling factors. To address these challenges, our work introduces a novel framework capable of synthesizing authentic LR images from a single HR image by leveraging the latent degradation space with flow matching. Our approach generates LR images with realistic artifacts at unseen degradation levels, which facilitates the creation of large-scale, real-world SR training datasets. Comprehensive quantitative and qualitative assessments verify that our synthetic LR images accurately replicate real-world degradations. Furthermore, both traditional and arbitrary-scale SR models trained using our datasets consistently yield much better HR outcomes.

Paper Structure (53 sections, 14 equations, 11 figures, 9 tables)

Figures (11)

  • Figure 1: DegFlow generates real-world LR images across continuous scales by modeling degradation trajectories in a learned latent space. The generated LR images are used to train arbitrary SR models for high-quality restoration.
  • Figure 2: Overview of the proposed method. (a) Two-stage training phase. (b) Inference phase.
  • Figure 3: Visualization of continuous degradation. (a) Real images from the RealSR dataset at discrete scales (HR, ×2, ×3, ×4). (b) DegFlow-generated intermediate degradations at evenly spaced timesteps $0\leq t \leq 1$.
  • Figure 4: Normalized PSNR, CLIP, and FID scores across different timesteps on the RealSR $\times3$ test set.
  • Figure 5: Qualitative comparisons on the RealSR $\times3$ dataset. Fixed-scale SR results (HAN, HAT, MambaIR) and arbitrary-scale SR results (MetaSR, LIIF, CiaoSR) trained with either InterFlow (IF) generated LR (a) or our synthesized LR (b) are compared.
  • ...and 6 more figures