LGEST: Dynamic Spatial-Spectral Expert Routing for Hyperspectral Image Classification

Jiawen Wen, Suixuan Qiu, Zihang Luo, Xiaofei Yang, Haotian Shi

Abstract

Deep learning methods, including convolutional neural networks, Transformers, and Mamba, have achieved remarkable success in hyperspectral image (HSI) classification. Nevertheless, existing methods integrate local and global representations inflexibly, handle spectral-spatial scale disparities across heterogeneous bands inadequately, and remain susceptible to the Hughes phenomenon under high-dimensional sample heterogeneity. To address these challenges, we propose the Local-Global Expert Spatial-Spectral Transformer (LGEST), a novel framework that combines three key innovations. First, LGEST employs a Deep Spatial-Spectral Autoencoder (DSAE) to generate compact yet discriminative embeddings through hierarchical nonlinear compression, preserving 3D neighborhood coherence while mitigating information loss in high-dimensional spaces. Second, a Cross-Interactive Mixed Expert Feature Pyramid (CIEM-FPN) leverages cross-attention mechanisms and residual mixture-of-experts layers to dynamically fuse multi-scale features, adaptively weighting spectral discriminability and spatial saliency through learnable gating functions. Finally, a Local-Global Expert System (LGES) processes decomposed features via sparsely activated expert pairs: convolutional sub-experts capture fine-grained textures, while transformer sub-experts model long-range contextual dependencies, and a routing controller dynamically selects experts based on real-time feature saliency. Extensive experiments on four benchmark datasets demonstrate that LGEST consistently outperforms state-of-the-art methods.

Paper Structure (23 sections, 18 equations, 9 figures, 8 tables, 1 algorithm)

Figures (9)

  • Figure 1: The architecture of the proposed LGEST. The network first processes the input data with a deep spatial-spectral autoencoder (DSAE), which serves as a local feature extractor. It then uses the cross-interactive mixed expert feature pyramid (CIEM-FPN) to fuse features across different scales via parallel and downsampling branches, enhancing the interactions between them. This is followed by the local-global expert system (LGES), which is divided into two expert groups that handle local and global information, respectively. Within each group, the router selects the activated expert from the sub-experts, which are linear layers.
  • Figure 2: The structure of the cross-interactive mixed expert feature pyramid (CIEM-FPN). (a) shows the overall framework of CIEM-FPN, which incorporates a novel cross-interactive mixed expert module (CIEM) that enhances model comprehension and generation by creating associations between different input sequences. (b) presents the detailed structure of the CIEM module. To enable cross-interaction, its inputs can originate from both the parallel structure and the output of the upper level. (c) illustrates the RMoE layer, in which the experts are learnable matrices. (d) illustrates the cross-attention mechanism employed in CIEM.
  • Figure 3: Classification maps of different methods on the IndianPines dataset (with 10% training samples).
  • Figure 4: Classification maps of different methods on the KSC dataset (with 10% training samples).
  • Figure 5: Classification maps of different methods on the WHU-Hi-LongKou dataset (with 0.1% training samples).
  • ...and 4 more figures
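
The Figure 1 caption describes a router that selects one activated expert from the linear sub-experts in each of two groups (local and global). The following is a minimal sketch of that top-1 routing idea in plain Python, not the authors' implementation. The class name `ExpertGroup`, the helper functions, the feature dimension, the number of sub-experts, the softmax gate, and the additive fusion of the two group outputs are all assumptions made for illustration; the page does not specify them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Compute y = W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def make_linear(rng, out_dim, in_dim):
    """Create a randomly initialised linear layer as (weights, bias)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


class ExpertGroup:
    """Hypothetical group of linear sub-experts with a top-1 router."""

    def __init__(self, rng, dim, num_experts):
        # Assumption: the router is itself a linear layer producing one
        # score per sub-expert.
        self.router = make_linear(rng, num_experts, dim)
        self.experts = [make_linear(rng, dim, dim) for _ in range(num_experts)]

    def forward(self, x):
        gate = softmax(linear(*self.router, x))
        # Sparse activation: only the highest-scoring sub-expert is evaluated.
        chosen = max(range(len(gate)), key=gate.__getitem__)
        expert_out = linear(*self.experts[chosen], x)
        # Scale by the gate probability, a common mixture-of-experts choice
        # (assumed here, not taken from the source).
        return [gate[chosen] * v for v in expert_out], chosen, gate


rng = random.Random(0)
local_group = ExpertGroup(rng, dim=4, num_experts=3)   # "local" expert group
global_group = ExpertGroup(rng, dim=4, num_experts=3)  # "global" expert group

token = [0.2, -0.1, 0.7, 0.05]  # one illustrative feature vector
local_out, local_idx, local_gate = local_group.forward(token)
global_out, global_idx, global_gate = global_group.forward(token)

# Assumption: the two group outputs are fused by element-wise addition.
fused = [a + b for a, b in zip(local_out, global_out)]
print("local expert:", local_idx, "global expert:", global_idx)
```

Only one sub-expert per group is evaluated for each input, so the per-token cost stays at one linear layer per group regardless of how many sub-experts the group holds.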