Swift Parameter-free Attention Network for Efficient Super-Resolution
Cheng Wan, Hongyuan Yu, Zhiqi Li, Yihang Chen, Yajun Zou, Yuqing Liu, Xuanwu Yin, Kunlong Zuo
TL;DR
The paper addresses the efficiency limits of attention-based single-image super-resolution by introducing SPAN, the Swift Parameter-free Attention Network, whose attention mechanism adds no learnable parameters. Attention maps are computed directly from convolutional outputs using an origin-symmetric activation $\sigma_a$ and combined with a residual connection to form the SPAB blocks, which yield $O_i=U_i\odot V_i$ where $U_i=O_{i-1}\oplus H_i$ and $V_i=\sigma_a(H_i)$. The architecture stacks six SPABs, followed by feature fusion and PixelShuffle upsampling, and employs re-parameterization to boost inference speed. Empirically, SPAN achieves state-of-the-art quality-speed trade-offs on multiple benchmarks while using substantially fewer parameters than baselines, and it won first place in both the overall performance and runtime tracks of the NTIRE 2024 efficient super-resolution challenge. These results suggest that parameter-free attention built from symmetric activations and residual connections can deliver competitive SR performance with practical efficiency, with potential applicability to other vision tasks.
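The SPAB output rule $O_i=U_i\odot V_i$ can be illustrated with a minimal pure-Python sketch. It makes several assumptions that are not stated in this summary: $\oplus$ is read as element-wise addition, feature maps are flattened to plain lists of floats, and $\sigma_a$ is taken to be $\mathrm{sigmoid}(x)-0.5$ (written here as $0.5\tanh(x/2)$), which is one possible origin-symmetric choice. The function names are hypothetical, and the convolutions that produce $H_i$ are omitted.

```python
import math


def sigma_a(x):
    """Assumed origin-symmetric activation: sigmoid(x) - 0.5.

    Written as 0.5 * tanh(x / 2), which is the same function but avoids
    overflow for large negative x. It satisfies sigma_a(-x) == -sigma_a(x).
    """
    return 0.5 * math.tanh(0.5 * x)


def parameter_free_attention(o_prev, h):
    """Compute O_i = U_i * V_i element-wise, with no learnable weights.

    o_prev: output of the previous block, O_{i-1} (flattened list of floats)
    h:      convolutional features of the current block, H_i (same length)
    """
    if len(o_prev) != len(h):
        raise ValueError("o_prev and h must have the same length")
    # U_i = O_{i-1} (+) H_i, with (+) assumed to be element-wise addition
    u = [a + b for a, b in zip(o_prev, h)]
    # V_i = sigma_a(H_i): the attention map is derived from H_i itself
    v = [sigma_a(x) for x in h]
    # O_i = U_i (.) V_i: element-wise product
    return [ui * vi for ui, vi in zip(u, v)]
```

Under these assumptions, a position where $H_i$ is zero receives zero attention regardless of the residual input, while positions with larger $|H_i|$ receive attention of larger magnitude, bounded by 0.5.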
Abstract
Single Image Super-Resolution (SISR) is a crucial task in low-level computer vision, aiming to reconstruct high-resolution images from low-resolution counterparts. Conventional attention mechanisms have significantly improved SISR performance but often result in complex network structures and a large number of parameters, leading to slow inference and large model size. To address this issue, we propose the Swift Parameter-free Attention Network (SPAN), a highly efficient SISR model that balances parameter count, inference speed, and image quality. SPAN employs a novel parameter-free attention mechanism, which leverages symmetric activation functions and residual connections to enhance high-contribution information and suppress redundant information. Our theoretical analysis demonstrates the effectiveness of this design in achieving the attention mechanism's purpose. We evaluate SPAN on multiple benchmarks, showing that it outperforms existing efficient super-resolution models in both image quality and inference speed, achieving a favorable quality-speed trade-off. This makes SPAN highly suitable for real-world applications, particularly in resource-constrained scenarios. Notably, we won first place in both the overall performance track and the runtime track of the NTIRE 2024 efficient super-resolution challenge. Our code and models are publicly available at https://github.com/hongyuanyu/SPAN.
