Efficient Learnable Collaborative Attention for Single Image Super-Resolution
Yigang Zhao, Chaowei Zheng, Jiannan Su, Guangyong Chen, Min Gan
TL;DR
The paper addresses the heavy computational burden of non-local attention in single image super-resolution (SR) by introducing Learnable Collaborative Attention (LCoA). LCoA combines a Learnable Sparse Pattern (LSP), which uses k-means clustering to build a data-driven sparse attention pattern, with Collaborative Attention (CoA), which shares the resulting attention weights across network layers to avoid redundant similarity computations. Empirically, LCoA cuts non-local modeling time at inference by about 83% and reduces memory consumption. Integrated into a deep Learnable Collaborative Attention Network (LCoAN), it remains competitive in PSNR/SSIM with state-of-the-art SR methods on standard benchmarks across several scale factors.
Abstract
Non-Local Attention (NLA) is a powerful technique for capturing long-range feature correlations in deep single image super-resolution (SR). However, NLA suffers from high computational complexity and memory consumption, as it requires aggregating all non-local feature information for each query response and recalculating the similarity weight distribution for different abstraction levels of features. To address these challenges, we propose a novel Learnable Collaborative Attention (LCoA) that introduces inductive bias into non-local modeling. Our LCoA consists of two components: Learnable Sparse Pattern (LSP) and Collaborative Attention (CoA). LSP uses the k-means clustering algorithm to dynamically adjust the sparse attention pattern of deep features, which reduces the number of non-local modeling rounds compared with existing sparse solutions. CoA leverages the sparse attention pattern and weights learned by LSP, and co-optimizes the similarity matrix across different abstraction levels, which avoids redundant similarity matrix calculations. The experimental results show that our LCoA can reduce the non-local modeling time by about 83% in the inference stage. In addition, we integrate our LCoA into a deep Learnable Collaborative Attention Network (LCoAN), which achieves competitive performance in terms of inference time, memory consumption, and reconstruction quality compared with other state-of-the-art SR methods.
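The two ideas described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names (`kmeans_assign`, `sparse_attention_weights`, `collaborative_attention`), the dot-product similarity, the plain Lloyd's k-means, and the choice to compute the weights from the first feature level are all assumptions made for illustration. The sketch restricts attention to feature pairs that fall in the same k-means cluster (the sparse pattern) and then reuses that single weight matrix for several feature levels instead of recomputing it (the shared weights).

```python
import numpy as np

def kmeans_assign(x, k, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative); returns one cluster index per row of x."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every feature to every center: shape (N, k).
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members):  # keep the old center if a cluster becomes empty
                centers[c] = members.mean(axis=0)
    return labels

def sparse_attention_weights(x, labels):
    """Row-wise softmax of dot-product scores, restricted to same-cluster pairs."""
    scores = x @ x.T
    mask = labels[:, None] == labels[None, :]
    # Pairs in different clusters get -inf, so they receive zero weight.
    # A dense (N, N) matrix is used here only for readability; an efficient
    # version would compute one small block per cluster.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

def collaborative_attention(features_per_level, k=4):
    """Compute sparse weights once (assumed: from the first level), reuse for all levels."""
    labels = kmeans_assign(features_per_level[0], k)
    weights = sparse_attention_weights(features_per_level[0], labels)
    return [weights @ f for f in features_per_level], weights

# Usage: three hypothetical feature levels, each with 32 positions and 8 channels.
rng = np.random.default_rng(1)
levels = [rng.normal(size=(32, 8)) for _ in range(3)]
outputs, weights = collaborative_attention(levels, k=4)
```

In this sketch the similarity matrix is computed once rather than once per level, which mirrors the redundancy that CoA is described as avoiding; how LCoA actually learns and shares the pattern and weights is specified in the paper, not here.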
