High-Resolution Be Aware! Improving the Self-Supervised Real-World Super-Resolution

Yuehan Zhang, Angela Yao

TL;DR

The paper proposes a controller that adjusts the degradation modeling based on the quality of super-resolution results, and introduces a novel feature-alignment regularizer that directly constrains the distribution of super-resolved images.

Abstract

Self-supervised learning is crucial for super-resolution because ground-truth images are usually unavailable in real-world settings. Existing methods derive self-supervision from low-resolution images by creating pseudo-pairs or by enforcing a low-resolution reconstruction objective. These methods struggle with insufficient modeling of real-world degradations and a lack of knowledge about high-resolution imagery, resulting in unnatural super-resolved results. This paper strengthens awareness of the high-resolution image to improve self-supervised real-world super-resolution. We propose a controller to adjust the degradation modeling based on the quality of super-resolution results. We also introduce a novel feature-alignment regularizer that directly constrains the distribution of super-resolved images. Our method finetunes off-the-shelf SR models for a target real-world domain. Experiments show that it produces natural super-resolved images with state-of-the-art perceptual performance.
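
The low-resolution reconstruction objective mentioned in the abstract can be illustrated with a minimal sketch. It assumes a fixed, known degradation (box-filter downsampling), which is a deliberate simplification: the paper models the degradation with a learned network and adjusts it with a controller, and none of that is reproduced here. The names `downsample` and `lr_reconstruction_loss` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def downsample(hr: np.ndarray, scale: int) -> np.ndarray:
    """Stand-in degradation (assumption): average each scale x scale block."""
    h, w = hr.shape
    if h % scale or w % scale:
        raise ValueError("image size must be divisible by scale")
    return hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

def lr_reconstruction_loss(sr: np.ndarray, lr: np.ndarray, scale: int) -> float:
    """L1 distance between the re-degraded SR output and the observed LR input."""
    return float(np.abs(downsample(sr, scale) - lr).mean())

# Toy LR image and two candidate "super-resolved" outputs at 2x scale.
lr = np.arange(16, dtype=np.float64).reshape(4, 4)
sr_consistent = np.kron(lr, np.ones((2, 2)))  # nearest-neighbour upsampling
sr_shifted = sr_consistent + 1.0              # same structure, wrong brightness

loss_consistent = lr_reconstruction_loss(sr_consistent, lr, 2)
loss_shifted = lr_reconstruction_loss(sr_shifted, lr, 2)
```

Many different high-resolution images downsample to the same low-resolution image, so a small value of this loss only indicates consistency with the input; it does not by itself determine the high-resolution output.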

Paper Structure

This paper contains 15 sections, 12 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: (a) Previous self-supervised SR methods depend on degradation modeling and are limited by only using knowledge of low-resolution (LR) images. (b) Our method strengthens the awareness of high-resolution (HR) images for self-supervision. We adjust the degradation modeling according to the quality of super-resolved HR images and incorporate a prior on HR imagery.
  • Figure 2: (a) Architecture of the LR reconstruction network in chen2024low. (b) The numbers (PSNR$\uparrow$/LPIPS$\downarrow$) respectively. When the embedding $e_d$ fails to reproduce the blur effect in $X$, passing the ground-truth $Y_{gt}$ through $R$ yields lower reconstruction performance than passing its blurred version.
  • Figure 3: Overview of our method. In each stage, only colored modules are optimized, and LR images are interpolated for better visuals. (a) Pretraining of the LR reconstruction network. Parameters in FAR and the reconstruction network ($E_{deg}$, $E_{img}$, and $R$) are optimized with $\mathcal{L}_{rec}$ and $\Phi_{far}$. Controller $\mathbf{s}$ adjusts the degradation embedding $e_d$. LR input $X^s$ is synthesized from HR image $Y^s_{gt}$. (b) Finetuning of the super-resolution model $\mathrm{M}$. Given a real-world LR image $X^r$, we input the HR result $\mathrm{M}(X^r)$ to $E_{img}$ and $E_{clip}$. Only parameters in $\mathrm{M}$ are finetuned with $\mathcal{L}_{rec}$ and $\Phi_{far}$. The relationship between $\mathbf{s}$ and HQI differs between finetuning and pretraining.
  • Figure 4: Details of the Feature-Alignment Regularizer.
  • Figure 5: Visual comparison with state-of-the-art methods (zoom in for a better view). RE refers to RealESRGAN+, and columns (e-g) show the results of methods that finetune RE. Compared to SRTTA deng2023efficient and LWay chen2024low, our method restores shapes close to those in the HR images, such as the windows in the second row and the triangles in the third row, and generates realistic patterns.
  • ...and 4 more figures
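
The Figure 2 caption reports PSNR alongside LPIPS. PSNR has the standard closed form $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, and the sketch below computes it under the assumption that images are scaled to $[0, 1]$ (so `data_range=1.0`); the paper's exact evaluation protocol is not stated in this section. LPIPS depends on a pretrained network and is not sketched here.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Toy example: a constant error of 0.1 gives MSE = 0.01, i.e. about 20 dB.
ref = np.zeros((4, 4))
est = np.full((4, 4), 0.1)
value = psnr(ref, est)
```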