Learning Many-to-Many Mapping for Unpaired Real-World Image Super-resolution and Downscaling

Wanjie Sun; Zhenzhong Chen

TL;DR

Quantitative and qualitative results on real-world image SR datasets indicate that SDFlow can generate diverse, realistic LR and SR images.

Abstract

Learning-based single image super-resolution (SISR) for real-world images is an active but challenging research topic because paired low-resolution (LR) and high-resolution (HR) training images are lacking. Most existing unsupervised real-world SISR methods adopt a two-stage training strategy: they first synthesize realistic LR images from their HR counterparts, then train the super-resolution (SR) model in a supervised manner. However, this strategy trains the image degradation and SR models separately, ignoring the inherent mutual dependency between downscaling and its inverse, upscaling. Additionally, the ill-posed nature of image degradation is not fully considered. In this paper, we propose an image downscaling and SR model, dubbed SDFlow, which simultaneously learns a bidirectional many-to-many mapping between real-world LR and HR images in an unsupervised manner. The main idea of SDFlow is to decouple image content and degradation information in the latent space, where the content distributions of LR and HR images are matched in a common latent space. The degradation information of the LR images and the high-frequency information of the HR images are each fitted to an easy-to-sample conditional distribution. Quantitative and qualitative results on real-world image SR datasets indicate that SDFlow can generate diverse, realistic LR and SR images.
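As a point of orientation, fitting a latent variable to an easy-to-sample conditional distribution with a normalizing flow usually relies on the change-of-variables formula. The expression below is the generic textbook form, not necessarily the exact objective used in the paper. The symbols are assumptions for illustration: $\mathbf{z}_\text{h}$ is the high-frequency latent, $\mathbf{z}_\text{c}$ is the content latent used as the condition, and $f$ is an invertible map whose output is modeled as standard normal.

```latex
\log p(\mathbf{z}_\text{h} \mid \mathbf{z}_\text{c})
  = \log \mathcal{N}\!\left(f(\mathbf{z}_\text{h}; \mathbf{z}_\text{c});\, \mathbf{0}, \mathbf{I}\right)
  + \log \left| \det \frac{\partial f(\mathbf{z}_\text{h}; \mathbf{z}_\text{c})}{\partial \mathbf{z}_\text{h}} \right|
```

Because $f$ is invertible, sampling amounts to drawing a standard normal vector and applying $f^{-1}$ under the same condition, which is what makes one-to-many generation possible.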

Paper Structure (21 sections, 18 equations, 9 figures, 6 tables)

Introduction
Related Work
Real-world single image super-resolution
Normalizing flow and INN for low-level computer vision
Method
Problem statement
Deriving the solution for many-to-many image SR and downscaling under the variational inference framework
SDFlow
Normalizing flow
Model architecture
Training objectives
Experiments
Experimental setup
Datasets and evaluation metrics
Implementation details
...and 6 more sections

Figures (9)

Figure 1: The graphical model of SDFlow. It simultaneously learns the diverse generation of real-world LR and SR images as a unified task.
Figure 2: Overview of our proposed SDFlow framework. In the forward pass, HR Flow maps the HR input into the latent space and divides the latent variable into $[\mathbf{z}_\text{c}, \mathbf{z}_\text{h}]$, representing the image content and the HF components. The HF Flow projects $\mathbf{z}_\text{h}$ to a standard-normal-distributed $\mathbf{z}'_\text{h}$, conditioned on features extracted from $\mathbf{z}_\text{c}$. Likewise in the forward pass, LR Flow decouples the content $\mathbf{z}_\text{c}$ and degradation $\mathbf{z}_\text{d}$ of the input LR image in the latent space. The Deg Flow then transforms $\mathbf{z}_\text{d}$ into $\mathbf{z}'_\text{d}$, conditioned on features extracted from $\mathbf{z}_\text{c}$, such that $\mathbf{z}'_\text{d}$ follows a standard normal distribution. Conversely, in the backward pass, the sampled HF variable and the content variable of the LR image are passed through the inverses of HF Flow and HR Flow to generate the SR image, while the sampled degradation variable and the content variable of the HR image are passed through the inverses of Deg Flow and LR Flow to generate the downscaled image.
Figure 3: Architecture of the LR Content Feature Extractor. It is based on ResNet, with Degradation Modulation added to learn a degradation-adaptive transformation.
Figure 4: Qualitative $4\times$ SR images on the RealSR and DRealSR test datasets. Three parts of each SR image are magnified for better visual comparison. The proposed SDFlow produces more visually pleasing textures.
Figure 5: Qualitative $4\times$ downscaled real-world HR images on the RealSR test dataset. Sample images are selected with typical complex textures and clear structural contents.
...and 4 more figures
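
The data flow described in the Figure 2 caption can be illustrated with a toy sketch. This is not the SDFlow implementation: the single conditional affine layer, the random weights standing in for a learned conditioning network, and all names (`cond_params`, `flow_forward`, `flow_inverse`, `CONTENT_DIM`, `HF_DIM`) are assumptions made for illustration. It only shows how a content latent can condition an invertible map on the HF latent, and how sampling the normalized HF variable yields several SR latents for one LR content latent.

```python
import math
import random


def cond_params(z_c, weights):
    """Hypothetical conditioning network: one linear map per output dimension."""
    log_s, t = [], []
    for w_s, w_t in weights:
        # tanh bounds the log-scale so the affine map stays well conditioned.
        log_s.append(math.tanh(sum(a * b for a, b in zip(w_s, z_c))))
        t.append(sum(a * b for a, b in zip(w_t, z_c)))
    return log_s, t


def flow_forward(z, z_c, weights):
    """Map z to z' given the condition z_c; also return log|det dz'/dz|."""
    log_s, t = cond_params(z_c, weights)
    z_prime = [(zi - ti) * math.exp(-si) for zi, si, ti in zip(z, log_s, t)]
    return z_prime, -sum(log_s)


def flow_inverse(z_prime, z_c, weights):
    """Exact inverse of flow_forward under the same condition z_c."""
    log_s, t = cond_params(z_c, weights)
    return [zp * math.exp(si) + ti for zp, si, ti in zip(z_prime, log_s, t)]


rng = random.Random(0)
CONTENT_DIM, HF_DIM = 3, 5  # toy sizes, assumed for illustration

# Random weights stand in for a trained conditioning network.
hf_weights = [
    ([rng.gauss(0, 0.5) for _ in range(CONTENT_DIM)],
     [rng.gauss(0, 0.5) for _ in range(CONTENT_DIM)])
    for _ in range(HF_DIM)
]

# Forward pass: split an HR latent into content and HF parts, then normalize HF.
z_hr = [rng.gauss(0, 1) for _ in range(CONTENT_DIM + HF_DIM)]
z_c, z_h = z_hr[:CONTENT_DIM], z_hr[CONTENT_DIM:]
z_h_prime, log_det = flow_forward(z_h, z_c, hf_weights)
z_h_rec = flow_inverse(z_h_prime, z_c, hf_weights)

# Backward pass (SR direction): keep the LR content latent fixed and draw
# several HF samples, giving several latents for the same content.
z_c_lr = [rng.gauss(0, 1) for _ in range(CONTENT_DIM)]
sr_latents = [
    z_c_lr + flow_inverse([rng.gauss(0, 1) for _ in range(HF_DIM)], z_c_lr, hf_weights)
    for _ in range(3)
]
```

In this sketch each entry of `sr_latents` shares the same content part and differs only in the HF part, which mirrors the one-to-many SR direction; the downscaling direction would follow the same pattern with a degradation latent in place of the HF latent.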
