Pyramid Hierarchical Transformer for Hyperspectral Image Classification

Muhammad Ahmad, Muhammad Hassaan Farooq Butt, Manuel Mazzara, Salvatore Distefano

TL;DR

PyFormer, a pyramid-based hierarchical transformer, organizes input data hierarchically into pyramid segments, each representing a distinct abstraction level. This improves processing efficiency for long input sequences and shows the potential of the approach for hyperspectral image classification (HSIC).

Abstract

The traditional Transformer model encounters challenges with variable-length input sequences, particularly in Hyperspectral Image Classification (HSIC), leading to efficiency and scalability concerns. To overcome this, we propose a pyramid-based hierarchical transformer (PyFormer). This innovative approach organizes input data hierarchically into segments, each representing distinct abstraction levels, thereby enhancing processing efficiency for lengthy sequences. At each level, a dedicated transformer module is applied, effectively capturing both local and global context. Spatial and spectral information flow within the hierarchy facilitates communication and abstraction propagation. Integration of outputs from different levels culminates in the final input representation. Experimental results underscore the superiority of the proposed method over traditional approaches. Additionally, the incorporation of disjoint samples augments robustness and reliability, thereby highlighting the potential of our approach in advancing HSIC. The source code is available at https://github.com/mahmad00/PyFormer.
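
The hierarchy described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (see the linked repository for that): the function names `self_attention`, `avg_pool`, and `pyramid_encode` are hypothetical, the attention uses identity projections instead of learned weights, and average pooling stands in for whatever downsampling the real model uses. It only shows the general pattern of attending at each level, coarsening the sequence, and concatenating the per-level summaries.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), so there are no learned parameters here.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return attended


def avg_pool(tokens, factor):
    """Average consecutive groups of `factor` tokens to form a coarser level."""
    pooled = []
    for start in range(0, len(tokens), factor):
        group = tokens[start:start + factor]
        pooled.append([sum(column) / len(group) for column in zip(*group)])
    return pooled


def pyramid_encode(tokens, num_levels=2, factor=2):
    """Attend at each pyramid level, then concatenate one mean token per level."""
    features = []
    level = tokens
    for _ in range(num_levels):
        attended = self_attention(level)
        # Summarise this level by its mean token (an illustrative choice).
        features.extend(sum(column) / len(attended) for column in zip(*attended))
        # Coarsen the sequence before moving to the next level.
        level = avg_pool(attended, factor)
    return features


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
features = pyramid_encode(tokens)
print(len(features))  # two levels times two feature dimensions
```

Each level works on a shorter sequence than the previous one, which is where the efficiency gain for long sequences comes from in this simplified picture.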

Paper Structure (4 sections, 4 equations, 2 figures, 6 tables)

Figures (2)

  • Figure 1: The pyramid-type hierarchical transformer features a structured pyramid block comprising two levels. Each level incorporates two convolutional layers with filter sizes of ($32 \times S \times S \times B$) and ($64 \times S \times S \times B$), respectively, followed by a residual connection and an activation function. The hierarchical transformer block, which receives the learned multi-scale information, consists of two layers and four attention heads. Its output is flattened and then passed through a ReLU activation with L2 regularization. This regularization helps mitigate overfitting by penalizing large weights, making the network less sensitive to minor input variations. Finally, the output layer applies softmax activation to produce the final classification maps.
  • Figure 2: The PyFormer model achieves kappa accuracies of 99.73%, 99.84%, and 98.01% on the disjoint training, validation, and test samples for the Pavia University (PU), Salinas (SA), and University of Houston (UH) datasets, respectively.
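
For readers unfamiliar with the kappa metric quoted in Figure 2, the sketch below computes it from a confusion matrix. It assumes that "kappa accuracy" refers to Cohen's kappa coefficient, as is common in HSIC evaluation; the function name `cohen_kappa` and the example matrix are illustrative and are not taken from the paper.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total

    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class example: 85 of 100 samples classified correctly.
example = [[45, 5], [10, 40]]
print(round(cohen_kappa(example), 4))  # 0.7
```

Kappa discounts the agreement expected by chance, so it is lower than plain accuracy (0.85 in the example) unless the classifier is perfect.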