Learning Exhaustive Correlation for Spectral Super-Resolution: Where Spatial-Spectral Attention Meets Linear Dependence

Hongyuan Wang, Lizhi Wang, Jiang Xu, Chang Chen, Xue Hu, Fenglong Song, Youliang Yan

TL;DR

The paper proposes the Exhaustive Correlation Transformer (ECT), which models exhaustive correlation within hyperspectral images (HSIs) by integrating unified spatial-spectral attention with linear dependence. Results on simulated and real data indicate state-of-the-art performance.

Abstract

Spectral super-resolution, which aims to recover a hyperspectral image (HSI) from an easily obtainable RGB image, has drawn increasing interest in the field of computational photography. The crucial aspect of spectral super-resolution lies in exploiting the correlation within HSIs. However, two bottlenecks in existing Transformers limit performance improvement and practical application. First, existing Transformers often emphasize either spatial-wise or spectral-wise correlation separately, which disrupts the 3D features of HSIs and hinders the exploitation of unified spatial-spectral correlation. Second, the existing self-attention mechanism always establishes a full-rank correlation matrix by learning the correlation between pairs of tokens, so it cannot describe the linear dependence among multiple tokens that is widespread in HSIs. To address these issues, we propose a novel Exhaustive Correlation Transformer (ECT) for spectral super-resolution. First, we propose a Spectral-wise Discontinuous 3D (SD3D) splitting strategy, which models unified spatial-spectral correlation by integrating a spatial-wise continuous splitting strategy with a spectral-wise discontinuous splitting strategy. Second, we propose a Dynamic Low-Rank Mapping (DLRM) model, which captures linear dependence among multiple tokens through a dynamically calculated low-rank dependence map. By integrating unified spatial-spectral attention and linear dependence, our ECT can model exhaustive correlation within HSIs. Experimental results on both simulated and real data indicate that our method achieves state-of-the-art performance. Code and pretrained models will be available later.
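
As a rough illustration of the splitting idea described in the abstract, the NumPy sketch below partitions a feature cube into 3D blocks that are contiguous in space but strided (discontinuous) along the spectral axis. The function name sd3d_split, the window size, and the modulo-based channel grouping are assumptions made for illustration only; the paper's actual SD3D strategy may differ in how it groups bands and aligns tokens.

```python
import numpy as np

def sd3d_split(cube, win=4, groups=2):
    """Split an (H, W, C) cube into blocks that are spatially contiguous
    and spectrally strided. Hypothetical helper, not the authors' code."""
    H, W, C = cube.shape
    assert H % win == 0 and W % win == 0 and C % groups == 0
    # Spatially continuous: cut H and W into contiguous win x win windows.
    # Spectrally discontinuous: channel c is assigned to group c % groups,
    # so group g holds channels g, g + groups, g + 2 * groups, ...
    x = cube.reshape(H // win, win, W // win, win, C // groups, groups)
    # Reorder to (window_row, window_col, group, win, win, channels_per_group).
    x = x.transpose(0, 2, 5, 1, 3, 4)
    # One block per (window, spectral group) pair.
    return x.reshape(-1, win, win, C // groups)
```

Under these assumptions, each returned block covers one spatial window and every groups-th band, so attention computed inside a block would see neighbouring pixels together with bands spread across the whole spectrum.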

Paper Structure

This paper contains 21 sections, 3 equations, 16 figures, 2 tables.

Figures (16)

  • Figure 1: Comparison of MRAE, inference latency, and parameter count on the NTIRE 2022 dataset. The circle radius represents the number of parameters.
  • Figure 2: The macro design of Exhaustive Correlation Transformer (ECT).
  • Figure 3: Micro Design of ECT. (a) Exhaustive Self-Attention Block (ESAB). (b) Feed Forward Network (FFN). (c) Exhaustive Self-Attention (ESA). Key designs in ESA are the Spectral-wise Discontinuous 3D (SD3D) splitting and alignment strategies, the Dynamic Low-Rank Mapping (DLRM) model, and the Unified Spatial-Spectral self-Attention (USSA) model.
  • Figure 4: Detailed design of the Dynamic Low-Rank Mapping (DLRM) module.
  • Figure 5: Comparison of the Attention Map in Self-Attention (SA) and the Dependence Map in DLRM.
  • ...and 11 more figures
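
To make the contrast named in the Figure 5 caption more concrete, the sketch below builds a softmax attention map from pairwise token similarities and, next to it, a map factored through a small inner dimension so that its rank is bounded by construction. The projection matrices wq, wk, wa, wb, the rank r, and both function names are hypothetical; this is a generic input-dependent low-rank factorization, not the DLRM module from the paper.

```python
import numpy as np

def attention_map(x, wq, wk):
    """Pairwise softmax attention map of shape (N, N) for tokens x of shape (N, d)."""
    q, k = x @ wq, x @ wk
    logits = q @ k.T / np.sqrt(q.shape[1])
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def low_rank_map(x, wa, wb):
    """Input-dependent (N, N) map with rank at most r, where wa and wb are (d, r)."""
    a = x @ wa  # (N, r), computed from the current input
    b = x @ wb  # (N, r), computed from the current input
    # A product of an (N, r) and an (r, N) matrix has rank at most r.
    return a @ b.T
```

With r much smaller than N, every row of the second map is a combination of the same r basis rows, which is one simple way to express a dependence shared among many tokens rather than a score for each pair.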