LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification
Judy X Yang, Jun Zhou, Jing Wang, Hui Tian, Alan Wee-Chung Liew
TL;DR
This work addresses the high dimensionality and redundancy of hyperspectral data by introducing a LiDAR-guided cross-attention mechanism to select informative HSI bands for fusion with LiDAR. The method uses a transformer-inspired architecture with self-attention on each modality and a cross-attention step where LiDAR queries weight HSI bands, enabling end-to-end band selection and classification. Across three paired datasets (Houston 2013, Trento, MUUFL), the approach consistently outperforms traditional band-selection baselines and full-band fusion models, particularly when data augmentation is used, achieving state-of-the-art accuracy with a reduced spectral footprint. The results suggest practical benefits for real-time remote sensing applications by reducing data volume without sacrificing, and often enhancing, classification performance.
Abstract
The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high dimensionality and redundancy of hyperspectral images, even though band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transformer architecture for the selection of HSI bands guided by LiDAR data. LiDAR provides high-resolution vertical structural information, which can be useful in distinguishing different types of land cover that may have similar spectral signatures but different structural profiles. In our approach, the LiDAR data are used as the "query" to search for and identify the "key" from the HSI, so that the bands most pertinent to the LiDAR data are chosen. This ensures that the selected HSI bands drastically reduce redundancy and computational requirements while working optimally with the LiDAR data. Extensive experiments have been undertaken on three paired HSI and LiDAR data sets: Houston 2013, Trento and MUUFL. The results highlight the superiority of the cross-attention mechanism, underlining the enhanced classification accuracy of the identified HSI bands when fused with the LiDAR features. The results also show that the use of fewer bands combined with LiDAR surpasses the performance of state-of-the-art fusion models.
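
The query/key idea described above can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the LiDAR data have already been embedded into a single query vector and each HSI band into a key vector of the same length, that band relevance is scored with standard scaled dot-product attention, and that selection is a hard top-k over the attention weights. The function names (`lidar_guided_band_weights`, `select_top_bands`) and the toy numbers are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def lidar_guided_band_weights(lidar_query, hsi_band_keys):
    """Scaled dot-product attention weights of one LiDAR query over HSI band keys.

    lidar_query: list of d floats (assumed LiDAR embedding).
    hsi_band_keys: list of B lists of d floats (assumed per-band embeddings).
    Returns B weights that sum to 1.
    """
    d = len(lidar_query)
    scores = [
        sum(q * k for q, k in zip(lidar_query, key)) / math.sqrt(d)
        for key in hsi_band_keys
    ]
    return softmax(scores)


def select_top_bands(weights, k):
    """Indices of the k highest-weighted bands, in ascending band order.

    Hard top-k is an assumption of this sketch, not a detail from the abstract.
    """
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])


# Toy embeddings (made up for illustration): one LiDAR query, four HSI bands.
lidar_query = [1.0, 0.0, 0.5]
hsi_band_keys = [
    [0.9, 0.1, 0.4],   # band 0
    [-0.5, 0.8, 0.0],  # band 1
    [0.2, 0.1, 0.1],   # band 2
    [1.0, -0.2, 0.6],  # band 3
]

weights = lidar_guided_band_weights(lidar_query, hsi_band_keys)
selected = select_top_bands(weights, k=2)
print(selected)  # bands 0 and 3 align best with the LiDAR query
```

In a trained model the query and key vectors would come from learned projections rather than fixed numbers, and the selected bands would then be fused with the LiDAR features for classification.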
