Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution
Yutao Yuan, Chun Yuan
TL;DR
This work addresses the ill-posed nature of image super-resolution (SR) by learning the conditional distribution of high-resolution (HR) images given low-resolution inputs with a continuous-time conditional diffusion model. It introduces the Efficient Conditional Diffusion Model with Probability Flow Sampling (ECDP), which reduces generation time by sampling with the probability flow ODE and improves consistency through a hybrid parametrization of the denoiser that interpolates between data prediction and noise prediction across noise scales. An image quality loss in feature space complements the score matching loss and further aligns generated HR images with the ground truth. On DIV2K, ImageNet, and CelebA, ECDP achieves higher SR quality than existing diffusion-based SR methods while sampling faster. The authors provide open-source code, which supports reproducibility.
Abstract
Image super-resolution is a fundamentally ill-posed problem because multiple valid high-resolution images exist for one low-resolution image. Super-resolution methods based on diffusion probabilistic models can handle this ill-posedness by learning the distribution of high-resolution images conditioned on low-resolution images, which avoids the blurry outputs of PSNR-oriented methods. However, existing diffusion-based super-resolution methods are slow because they rely on iterative sampling, and the quality and consistency of their generated images suffer from problems such as color shifting. In this paper, we propose the Efficient Conditional Diffusion Model with Probability Flow Sampling (ECDP) for image super-resolution. To reduce the time consumption, we design a continuous-time conditional diffusion model for image super-resolution, which enables the use of probability flow sampling for efficient generation. Additionally, to improve the consistency of generated images, we propose a hybrid parametrization for the denoiser network, which interpolates between the data-predicting parametrization and the noise-predicting parametrization for different noise scales. Moreover, we design an image quality loss as a complement to the score matching loss of diffusion models, further improving the consistency and quality of super-resolution. Extensive experiments on DIV2K, ImageNet, and CelebA demonstrate that our method achieves higher super-resolution quality than existing diffusion-based image super-resolution methods while having lower time consumption. Our code is available at https://github.com/Yuan-Yutao/ECDP.
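
To make the idea of probability flow sampling concrete, the following is a minimal sketch and not the authors' implementation. It assumes a variance-exploding noise model in which the noisy image at level sigma is the clean image plus Gaussian noise of standard deviation sigma, so the probability flow ODE can be written as dx/dsigma = (x - D(x, sigma, cond)) / sigma for a denoiser D. The names `toy_denoiser`, `make_sigmas`, and `probability_flow_sample`, the geometric noise schedule, and the plain Euler solver are all illustrative assumptions. The toy denoiser is a closed-form stand-in for a trained network (it is exact when the HR image given the condition is Gaussian around `cond`), and it does not reproduce the paper's hybrid parametrization.

```python
import numpy as np


def toy_denoiser(x, sigma, cond, data_std=0.5):
    """Hypothetical stand-in for a trained conditional denoiser.

    Closed-form posterior mean E[x0 | x, cond] under the toy assumption
    x0 | cond ~ N(cond, data_std^2 I) and x = x0 + sigma * noise.
    """
    weight = data_std**2 / (data_std**2 + sigma**2)
    return cond + weight * (x - cond)


def make_sigmas(n_steps, sigma_max=10.0, sigma_min=1e-3):
    """Geometric noise schedule from sigma_max down to sigma_min, then 0 (assumed)."""
    return np.append(np.geomspace(sigma_max, sigma_min, n_steps), 0.0)


def probability_flow_sample(denoiser, cond, x_init, sigmas):
    """Integrate dx/dsigma = (x - D(x, sigma, cond)) / sigma with Euler steps.

    The trajectory is deterministic given x_init: no noise is injected
    during sampling, unlike ancestral or SDE-based samplers.
    """
    x = np.array(x_init, dtype=float)
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        slope = (x - denoiser(x, s_cur, cond)) / s_cur  # ODE right-hand side
        x = x + (s_next - s_cur) * slope                # Euler update
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cond = rng.uniform(0.0, 1.0, size=(8, 8))   # stand-in for an upsampled LR image
    sigmas = make_sigmas(n_steps=50)
    x_init = sigmas[0] * rng.standard_normal(cond.shape)
    sample = probability_flow_sample(toy_denoiser, cond, x_init, sigmas)
    print(sample.shape)
```

Because the ODE is deterministic, the number of denoiser evaluations equals the number of schedule steps, and a higher-order solver or a coarser schedule can trade accuracy for speed. Which solver and schedule ECDP actually uses is not stated in the abstract.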
