Image Super-Resolution with Taylor Expansion Approximation and Large Field Reception
Jiancong Feng, Yuan-Gen Wang, Mingjie Li, Fengchuang Xing
TL;DR
This study reduces the computational cost of self-similarity in blind super-resolution (SR) by applying a second-order Taylor expansion approximation (STEA) to the softmax-attention mechanism, paired with a multi-scale large field reception (MLFR) module that compensates for the performance lost to the approximation. The approach is instantiated as LabNet for laboratory-style degradations and RealNet for real-world degradations, with RealNet-GAN adding perceptual and adversarial losses for greater realism. Ablation studies confirm that STEA and MLFR jointly yield the best performance while reducing attention complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$, enabling efficient, high-quality blind SR on diverse datasets. This makes blind SR more practical to deploy on resource-constrained devices while maintaining competitive visual fidelity in real-world scenarios.
Abstract
Self-similarity techniques have become popular in blind super-resolution (SR) because they accurately estimate the degradation types involved in low-resolution images. However, the high-dimensional matrix multiplication within self-similarity computation incurs a prohibitive computational cost. We observe that the high-dimensional attention map is obtained by multiplying Query and Key and then applying a softmax function. The softmax makes the Query-Key product inseparable, which poses a great challenge to reducing computational complexity. To address this issue, we first propose a second-order Taylor expansion approximation (STEA) that separates the matrix multiplication of Query and Key, reducing the complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N)$. Then, we design a multi-scale large field reception (MLFR) to compensate for the performance degradation caused by STEA. Finally, we apply these two core designs to laboratory and real-world scenarios by constructing LabNet and RealNet, respectively. Extensive experiments on five synthetic datasets demonstrate that LabNet sets a new benchmark in both qualitative and quantitative evaluations. On the RealWorld38 dataset, RealNet achieves superior visual quality over existing methods. Ablation studies further verify the contributions of STEA and MLFR to both the LabNet and RealNet frameworks.
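To illustrate the kind of separation STEA relies on, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard expansion $e^{x} \approx 1 + x + x^2/2$ applied to each Query-Key score and omits any score scaling or other details the full method may use. Because $(q^\top k)^2 = (q \otimes q)^\top (k \otimes k)$, the approximated score factors as $\phi(q)^\top \phi(k)$ with $\phi(x) = [1,\ x,\ \mathrm{vec}(x x^\top)/\sqrt{2}]$, so $\phi(K)^\top V$ can be computed once and reused for every query. The function names `taylor_feature_map`, `taylor_attention_linear`, and `taylor_attention_quadratic` are hypothetical and chosen only for this example.

```python
import numpy as np


def taylor_feature_map(x):
    """Map rows of x (N, d) to phi(x) of shape (N, 1 + d + d*d).

    By construction phi(q) . phi(k) = 1 + q.k + (q.k)**2 / 2,
    the second-order Taylor expansion of exp(q.k) around 0.
    """
    n, d = x.shape
    ones = np.ones((n, 1))
    # Outer product of each row with itself, flattened; the 1/sqrt(2)
    # on each side yields the 1/2 factor of the second-order term.
    second = np.einsum("ni,nj->nij", x, x).reshape(n, d * d) / np.sqrt(2.0)
    return np.concatenate([ones, x, second], axis=1)


def taylor_attention_linear(q, k, v):
    """Approximate attention without forming the N x N attention map."""
    phi_q = taylor_feature_map(q)      # (N, D), D = 1 + d + d*d
    phi_k = taylor_feature_map(k)      # (N, D)
    kv = phi_k.T @ v                   # (D, d_v), computed once
    k_sum = phi_k.sum(axis=0)          # (D,), used for row normalization
    numer = phi_q @ kv                 # (N, d_v)
    denom = phi_q @ k_sum              # (N,), positive since 1 + s + s^2/2 > 0
    return numer / denom[:, None]


def taylor_attention_quadratic(q, k, v):
    """Same approximation, computed the explicit O(N^2) way for reference."""
    s = q @ k.T                        # (N, N) score matrix
    w = 1.0 + s + 0.5 * s**2
    return (w / w.sum(axis=1, keepdims=True)) @ v


def softmax_attention(q, k, v):
    """Exact softmax attention (unscaled scores, matching the sketch above)."""
    s = q @ k.T
    w = np.exp(s - s.max(axis=1, keepdims=True))
    return (w / w.sum(axis=1, keepdims=True)) @ v


# Illustrative inputs (assumed sizes): N = 64 tokens, d = 4, d_v = 8.
rng = np.random.default_rng(0)
q = 0.1 * rng.standard_normal((64, 4))
k = 0.1 * rng.standard_normal((64, 4))
v = rng.standard_normal((64, 8))
out_linear = taylor_attention_linear(q, k, v)
out_exact = softmax_attention(q, k, v)
max_gap = np.abs(out_linear - out_exact).max()  # approximation error
```

The linear form matches the explicit $N \times N$ computation up to floating-point error, and for small scores it stays close to exact softmax attention. The cost of the separation is a feature dimension of $1 + d + d^2$, so the saving is in the number of tokens $N$, not in the channel dimension $d$.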
