Swift Parameter-free Attention Network for Efficient Super-Resolution

Cheng Wan; Hongyuan Yu; Zhiqi Li; Yihang Chen; Yajun Zou; Yuqing Liu; Xuanwu Yin; Kunlong Zuo

TL;DR

The paper tackles the efficiency limits of attention-based single-image super-resolution (SISR) by introducing SPAN, the Swift Parameter-free Attention Network, whose attention mechanism adds no learnable parameters. Attention maps are produced directly from convolutional outputs using an origin-symmetric activation $\sigma_a$, and residual connections complete each SPAB block, which yields $O_i=U_i\odot V_i$ where $U_i=O_{i-1}\oplus H_i$ and $V_i=\sigma_a(H_i)$; here $\oplus$ and $\odot$ denote element-wise addition and multiplication. The architecture stacks six SPABs with feature fusion and PixelShuffle upsampling, and employs re-parameterization to boost inference speed. Empirically, SPAN achieves state-of-the-art quality-speed trade-offs on multiple benchmarks while using substantially fewer parameters than baselines, and it won both the overall performance track and the runtime track of the NTIRE 2024 efficient super-resolution challenge. These results suggest that parameter-free attention with symmetric activations and residuals can deliver competitive SR performance with practical efficiency, with potential applicability to other vision tasks.
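The SPAB attention step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the convolutional output $H_i$ is taken as a given flat list of values rather than computed by convolutions, and the shifted sigmoid used for $\sigma_a$ is an assumed example of an origin-symmetric activation. The names `sigma_a` and `spab_attention` are hypothetical.

```python
import math


def sigma_a(x: float) -> float:
    """Origin-symmetric (odd) activation.

    Assumption: a sigmoid shifted by 0.5, chosen here only because it
    satisfies sigma_a(-x) == -sigma_a(x); the summary above does not
    specify the exact function.
    """
    return 1.0 / (1.0 + math.exp(-x)) - 0.5


def spab_attention(o_prev: list[float], h: list[float]) -> list[float]:
    """Compute O_i = U_i * V_i element-wise.

    U_i = O_{i-1} + H_i   (residual connection)
    V_i = sigma_a(H_i)    (parameter-free attention map)
    """
    if len(o_prev) != len(h):
        raise ValueError("o_prev and h must have the same length")
    u = [o + x for o, x in zip(o_prev, h)]
    v = [sigma_a(x) for x in h]
    return [a * b for a, b in zip(u, v)]


# Toy values standing in for feature maps (hypothetical data).
o_prev = [0.5, -1.0, 2.0]
h = [0.0, 3.0, -3.0]
out = spab_attention(o_prev, h)
```

In this toy run the first position has $H_i = 0$, so its attention weight is zero and the output there is suppressed regardless of the residual input, while positions with large $|H_i|$ receive weights close to $\pm 0.5$. No learnable weights are involved in forming the attention map.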

Abstract

Single Image Super-Resolution (SISR) is a crucial task in low-level computer vision, aiming to reconstruct high-resolution images from low-resolution counterparts. Conventional attention mechanisms have significantly improved SISR performance but often result in complex network structures and a large number of parameters, leading to slow inference and large model size. To address this issue, we propose the Swift Parameter-free Attention Network (SPAN), a highly efficient SISR model that balances parameter count, inference speed, and image quality. SPAN employs a novel parameter-free attention mechanism, which leverages symmetric activation functions and residual connections to enhance high-contribution information and suppress redundant information. Our theoretical analysis demonstrates that this design achieves the purpose of an attention mechanism. We evaluate SPAN on multiple benchmarks, showing that it outperforms existing efficient super-resolution models in both image quality and inference speed, achieving a favorable quality-speed trade-off. This makes SPAN highly suitable for real-world applications, particularly in resource-constrained scenarios. Notably, we won first place in both the overall performance track and the runtime track of the NTIRE 2024 efficient super-resolution challenge. Our code and models are publicly available at https://github.com/hongyuanyu/SPAN.
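To make "symmetric activation" concrete, the short derivation below checks origin symmetry for one candidate function. The shifted sigmoid is an assumption used only for illustration; the abstract does not state which activation is used.

$$
\sigma_a(x) = \frac{1}{1+e^{-x}} - \frac{1}{2} = \frac{e^{x}-1}{2\,(e^{x}+1)} = \frac{1}{2}\tanh\!\left(\frac{x}{2}\right)
$$

$$
\sigma_a(-x) = \frac{1}{2}\tanh\!\left(-\frac{x}{2}\right) = -\sigma_a(x), \qquad \sigma_a(0) = 0, \qquad |\sigma_a(x)| < \frac{1}{2}
$$

Under this assumption the attention weight has the same sign as the feature value and its magnitude grows monotonically with the feature magnitude, so near-zero responses are suppressed while strong responses of either sign are retained.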

Paper Structure (15 sections, 7 equations, 14 figures, 6 tables)

Introduction
Related Work
Efficient Super Resolution on Image
Attention Mechanism
Method
Network Architecture
Parameter-Free Attention Mechanism
Design Consideration
Experiments
Experimental Setup
Quantitative Results
Activation Function
Ablation Study
SPAN for NTIRE 2024 challenge
Conclusion

Figures (14)

Figure 1: Comparison of model latency, PSNR, and complexity on the Set14 dataset for the x4 scale factor task.
Figure 2: The proposed SPAN architecture. The yellow area indicates the internal structure of each SPAB module. $Att. Map_2$ denotes the generated attention map. The input is a low-resolution image, and the output is a high-resolution image.
Figure 3: $H_1$
Figure 4: $V_1$
Figure 5: $O_1$
...and 9 more figures
