Image Super-Resolution with Taylor Expansion Approximation and Large Field Reception

Jiancong Feng, Yuan-Gen Wang, Mingjie Li, Fengchuang Xing

TL;DR

This study addresses blind super-resolution (SR) by reducing the computational cost of self-similarity with a second-order Taylor expansion approximation (STEA) of the softmax attention mechanism, paired with a Multi-Scale Large Field Reception (MLFR) module that recovers the performance lost to the approximation. The approach is instantiated as LabNet for laboratory-style degradations and RealNet for real-world degradations, with RealNet-GAN extending realism through perceptual and adversarial losses. Ablation studies confirm that STEA and MLFR together yield the best performance while reducing attention complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$, enabling efficient, high-quality blind SR on diverse datasets. The proposed methods make it more practical to deploy blind SR on resource-constrained devices while maintaining competitive visual fidelity in real-world scenarios.

Abstract

Self-similarity techniques are increasingly popular in blind super-resolution (SR) because they accurately estimate the degradation types present in low-resolution images. However, the high-dimensional matrix multiplication within self-similarity computation incurs a prohibitive computational cost. We find that the high-dimensional attention map is derived from the matrix multiplication between Query and Key, followed by a softmax function. This softmax makes the matrix multiplication between Query and Key inseparable, which poses a great challenge for reducing computational complexity. To address this issue, we first propose a second-order Taylor expansion approximation (STEA) to separate the matrix multiplication of Query and Key, reducing the complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$. Then, we design a multi-scale large field reception (MLFR) to compensate for the performance degradation caused by STEA. Finally, we apply these two core designs to laboratory and real-world scenarios by constructing LabNet and RealNet, respectively. Extensive experiments on five synthetic datasets demonstrate that our LabNet sets a new benchmark in qualitative and quantitative evaluations. Tested on the RealWorld38 dataset, our RealNet achieves superior visual quality over existing methods. Ablation studies further verify the contributions of STEA and MLFR to both the LabNet and RealNet frameworks.
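
To make the separability argument concrete, here is a minimal NumPy sketch of how a second-order Taylor expansion can turn attention into a computation that is linear in $N$. It is an illustration under stated assumptions, not the paper's STEA module: this summary does not give the exact formulation, so the sketch uses the generic identity $e^{s} \approx 1 + s + s^2/2$ with $s = q \cdot k$, omits any scaling or normalization of Query and Key, and all function names are hypothetical. Because $(q \cdot k)^2 = \mathrm{vec}(q q^\top) \cdot \mathrm{vec}(k k^\top)$, the approximated kernel factors into a product of per-token feature maps, so Key and Value can be summarized once and reused for every Query.

```python
import numpy as np


def taylor_features(x):
    """Map rows of x (N, d) to phi(x) such that
    phi(q) . phi(k) = 1 + q.k + (q.k)^2 / 2  (hypothetical helper)."""
    n, d = x.shape
    ones = np.ones((n, 1))
    # Second-order term: flattened outer product of each row with itself.
    outer = np.einsum("ni,nj->nij", x, x).reshape(n, d * d)
    return np.concatenate([ones, x, outer / np.sqrt(2.0)], axis=1)


def taylor_attention_linear(q, k, v):
    """Attention with the Taylor-approximated kernel, never forming the N x N map."""
    phi_q = taylor_features(q)       # (N, D) with D = 1 + d + d^2
    phi_k = taylor_features(k)       # (N, D)
    kv = phi_k.T @ v                 # (D, d_v): summary of Key and Value
    k_sum = phi_k.sum(axis=0)        # (D,): summary used for normalization
    numerator = phi_q @ kv           # (N, d_v)
    denominator = phi_q @ k_sum      # (N,), positive since 1 + s + s^2/2 >= 1/2
    return numerator / denominator[:, None]


def taylor_attention_quadratic(q, k, v):
    """Same approximation computed the direct way, with the explicit N x N map."""
    s = q @ k.T
    w = 1.0 + s + 0.5 * s**2
    return (w @ v) / w.sum(axis=1, keepdims=True)


def softmax_attention(q, k, v):
    """Reference softmax attention (unscaled), quadratic in N."""
    s = q @ k.T
    w = np.exp(s - s.max(axis=1, keepdims=True))
    return (w @ v) / w.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = 0.1 * rng.standard_normal((64, 8))
    k = 0.1 * rng.standard_normal((64, 8))
    v = rng.standard_normal((64, 8))
    gap = np.abs(taylor_attention_linear(q, k, v) - softmax_attention(q, k, v)).max()
    print("max deviation from softmax attention:", gap)
```

In this sketch the linear form costs on the order of $N d^2 d_v$ operations instead of $N^2 d$, so it pays off when the number of tokens $N$ is much larger than the feature dimension $d$. The approximation is only accurate when the Query-Key scores are small, which is consistent with the abstract's remark that STEA causes a performance drop that MLFR is designed to offset.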

Paper Structure

This paper contains 18 sections, 20 equations, 10 figures, and 8 tables.

Figures (10)

  • Figure 1: The training and testing procedures of RealNet.
  • Figure 2: The overview of our proposed LabNet.
  • Figure 3: Analysis of the execution process of the Non-Local Attention (NLA) module.
  • Figure 4: The Second-Order Taylor Expansion Approximation (STEA) module.
  • Figure 5: The Multi-Scale Large Field Reception (MLFR) module.
  • ...and 5 more figures