Dual-domain Adaptation Networks for Realistic Image Super-resolution

Chaowei Fang, Bolin Fu, De Cheng, Lechao Cheng, Guanbin Li

TL;DR

This paper tackles realistic image super-resolution by bridging the gap between models trained on synthetic data and real-world degradations. It introduces Dual-domain Adaptation Networks (DAN), which combine a Spatial-Domain Adaptation (SDA) strategy with a Frequency-Domain Adaptation (FDA) branch to adapt pre-trained SR backbones (e.g., SwinIR) to real LR-HR pairs. SDA uses selective parameter updating with low-rank adapters to preserve low-level features while adapting to real data, and FDA merges FFT-based spectral information with backbone features to recover high-frequency details. Extensive experiments on RealSR, D2CRealSR, and DRealSR demonstrate state-of-the-art performance with far fewer trainable parameters than full fine-tuning, including robust cross-camera adaptation; ablation analyses validate the importance of FDA, SDA, and LoRA components. The work advances practical SR by enabling efficient transfer from simulated to realistic domains, with potential impact on surveillance, medical imaging, and consumer electronics where real degradations are prevalent.

Abstract

Realistic image super-resolution (SR) focuses on transforming real-world low-resolution (LR) images into high-resolution (HR) ones, handling more complex degradation patterns than synthetic SR tasks. This is critical for applications like surveillance, medical imaging, and consumer electronics. However, current methods struggle with limited real-world LR-HR data, which hampers the learning of basic image features. Pre-trained SR models from large-scale synthetic datasets offer valuable prior knowledge, which can improve generalization, speed up training, and reduce the need for extensive real-world data in realistic SR tasks. In this paper, we introduce a novel approach, Dual-domain Adaptation Networks, which efficiently adapts pre-trained image SR models from simulated to real-world datasets. To this end, we first set up a spatial-domain adaptation strategy that selectively updates parameters of pre-trained models and employs the low-rank adaptation technique to adjust frozen parameters. Recognizing that image super-resolution involves recovering high-frequency components, we further integrate a frequency-domain adaptation branch into the adapted model, which combines the spectral data of the input and the spatial-domain backbone's intermediate features to infer HR frequency maps, enhancing the SR result. Experimental evaluations on public realistic image SR benchmarks, including RealSR, D2CRealSR, and DRealSR, demonstrate the superiority of our proposed method over existing state-of-the-art models. Code is available at: https://github.com/dummerchen/DAN.
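The abstract's observation that SR amounts to recovering high-frequency components can be illustrated with a small NumPy sketch. This is not the paper's frequency-domain branch: the degradation (2x2 average pooling followed by nearest-neighbor upsampling), the random test image, the `cutoff` value, and the helper names `high_freq_energy_ratio` and `amplitude_phase` are all assumptions made for illustration. It only shows how an image's spectrum can be split into amplitude and phase, and how a crude degradation reduces the share of energy at high frequencies.

```python
import numpy as np


def high_freq_energy_ratio(img, cutoff=0.25):
    """Fraction of spectral energy outside a centered low-frequency disc.

    `cutoff` is a hypothetical radius in cycles per pixel (Nyquist is 0.5).
    """
    spec = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(spec) ** 2
    h, w = img.shape
    fy = np.fft.fftshift(np.fft.fftfreq(h))[:, None]
    fx = np.fft.fftshift(np.fft.fftfreq(w))[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    return power[radius > cutoff].sum() / power.sum()


def amplitude_phase(img):
    """Split the 2-D spectrum of an image into amplitude and phase maps."""
    spec = np.fft.fft2(img)
    return np.abs(spec), np.angle(spec)


rng = np.random.default_rng(0)
hr = rng.random((64, 64))  # stand-in for an HR image (assumption)

# Crude stand-in degradation (assumption): 2x2 average pooling,
# then nearest-neighbor upsampling back to the HR grid.
lr = hr.reshape(32, 2, 32, 2).mean(axis=(1, 3))
lr_up = np.repeat(np.repeat(lr, 2, axis=0), 2, axis=1)

# Amplitude and phase together carry the full image: the inverse FFT
# of amp * exp(i * phase) reproduces the input.
amp, pha = amplitude_phase(hr)
recon = np.fft.ifft2(amp * np.exp(1j * pha)).real

print("HR high-frequency share:      ", high_freq_energy_ratio(hr))
print("degraded high-frequency share:", high_freq_energy_ratio(lr_up))
```

The degraded image keeps a smaller share of its energy above the cutoff than the HR image, which is the gap a frequency-domain branch would aim to close. Real LR images may follow very different spectral patterns than this toy degradation.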

Paper Structure

This paper contains 21 sections, 2 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Our dual-domain adaptation method aims to adapt pre-trained image super-resolution (SR) models from simulated to realistic datasets. The first and third rows show SR enhancements, and the second and fourth rows highlight improved high-frequency components. The second column presents the outcomes of the pre-trained DAT model [chen2023dual]. The third and fourth columns demonstrate visual and structural improvements achieved with our spatial and dual-domain adaptation strategies, respectively.
  • Figure 2: Overview of our proposed dual-domain adaptation networks. It is built upon a pre-trained image SR model such as SwinIR [liang2021swinir], which consists of a head convolution, $N$ Transformer-based feature enhancement modules, and an upsampler. As shown in the upper stream, a spatial-domain adaptation strategy is introduced by unfreezing the tail units of each feature enhancement module and applying low-rank adapters to adjust the remaining units. The bottom stream presents the frequency-domain branch, which progressively accumulates the spectral signals to enhance the recovery of high-frequency components.
  • Figure 3: (a) Within the adapted Transformer unit, low-rank adapters are incorporated to modify the parameters of the linear layers that generate the query and value variables. The workflow of the linear layer with a low-rank adapter is illustrated in (b). The output of the frozen vanilla linear layer in the left branch is adapted with residuals generated by a pair of down-projection and up-projection linear layers together with a scaling layer in the right branch.
  • Figure 4: Visualization of frequency signals of LR images simulated by bicubic interpolation (left) and realistic LR images (right).
  • Figure 5: Scatter plots of PSNR values, training time per epoch, and number of trainable parameters for different methods. Larger points indicate more trainable parameters.
  • ...and 5 more figures
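
The linear layer with a low-rank adapter described in the Figure 3 caption can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `LoRALinear`, the `alpha / rank` scaling, the zero initialization of the up-projection (a common LoRA convention), and the layer sizes are all hypothetical choices. The caption itself only specifies a frozen linear layer whose output is adapted by a residual from a down-projection, an up-projection, and a scaling layer.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a scaled low-rank residual branch (sketch)."""

    def __init__(self, weight, bias, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight  # frozen pre-trained weight, never updated
        self.bias = bias      # frozen pre-trained bias, never updated
        # Trainable adapter: down-projection to `rank`, then up-projection.
        self.down = rng.normal(0.0, 0.01, size=(rank, in_features))
        # Assumption: zero init, so the adapter starts as a no-op.
        self.up = np.zeros((out_features, rank))
        # Assumption: the scaling layer is a constant alpha / rank.
        self.scale = alpha / rank

    def __call__(self, x):
        frozen = x @ self.weight.T + self.bias     # left branch
        residual = (x @ self.down.T) @ self.up.T   # right branch
        return frozen + self.scale * residual

    def trainable_params(self):
        return self.down.size + self.up.size


rng = np.random.default_rng(1)
layer = LoRALinear(rng.normal(size=(64, 64)), np.zeros(64), rank=4)
x = rng.normal(size=(10, 64))
y = layer(x)

print("output shape:", y.shape)
print("trainable adapter parameters:", layer.trainable_params())
print("frozen weight parameters:", layer.weight.size)
```

With these hypothetical sizes, the adapter trains 2 x 4 x 64 parameters while the 64 x 64 weight stays frozen, which illustrates why adapter-based tuning needs far fewer trainable parameters than full fine-tuning.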