Enhanced Multi-level Features for Very High Resolution Remote Sensing Scene Classification

Chiranjibi Sitaula; Sumesh KC; Jagannath Aryal

TL;DR

This work addresses very high-resolution (VHR) remote sensing (RS) scene classification, where high inter-class similarity and intra-class variability make classification performance unstable. It introduces an enhanced VHR attention module (EAM) that combines an improved CBAM-based upper path, a middle path for parallel/sequential attention, and a lower path that preserves convolutional information. The EAM is followed by ASPP and GAP, and the enhanced features from each level are then fused. The method reaches overall accuracies of 95.39% on AID and 93.04% on NWPU-RESISC45, with a lowest reported standard deviation of 0.001, which indicates stable performance. These results suggest that multi-scale, attention-driven feature fusion is useful for VHR RS scene classification and that the approach may extend to other backbones and datasets.

Abstract

Very high-resolution (VHR) remote sensing (RS) scene classification is challenging because of high inter-class similarity and intra-class variability. Existing deep learning (DL)-based methods have shown great promise for this task, but their classification performance remains unstable. To address this problem, we propose in this letter a novel DL-based approach. We devise an enhanced VHR attention module (EAM), followed by atrous spatial pyramid pooling (ASPP) and global average pooling (GAP), which yields enhanced features at each corresponding level. Multi-level feature fusion is then performed. Experimental results on two widely used VHR RS datasets show that the proposed approach yields competitive and stable (robust) classification performance, with a lowest standard deviation of 0.001. The highest overall accuracies on the AID and NWPU datasets are 95.39% and 93.04%, respectively.
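The abstract names three generic building blocks after the EAM: atrous (dilated) convolution, global average pooling, and multi-level fusion. The following is a minimal, standard-library-only sketch of those operations on toy data. It is not the authors' implementation: the 1-D convolution, the nested-list feature maps, the helper names, and the choice of concatenation as the fusion operator are all assumptions made for illustration, since the abstract does not specify them.

```python
from statistics import fmean


def dilated_conv1d(signal, kernel, rate):
    """'Valid' 1-D atrous convolution (cross-correlation, no kernel flip).

    Kernel taps are spaced `rate` samples apart, so a larger rate widens
    the receptive field without adding weights.
    """
    span = (len(kernel) - 1) * rate
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) to a length-C vector."""
    return [fmean(v for row in channel for v in row) for channel in feature_map]


def fuse_levels(level_maps):
    """Pool each level, then concatenate the pooled vectors.

    Concatenation is an assumed fusion operator for this sketch.
    """
    fused = []
    for fmap in level_maps:
        fused.extend(global_average_pool(fmap))
    return fused


# Hypothetical toy feature maps standing in for two backbone stages.
level_a = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [0.0, 4.0]]]  # 2 channels, 2x2
level_b = [[[2.0]], [[6.0]], [[10.0]]]                          # 3 channels, 1x1

fused = fuse_levels([level_a, level_b])
print(fused)  # [4.0, 1.0, 2.0, 6.0, 10.0]

# Same kernel, two dilation rates: rate 2 covers five samples per output.
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], rate=1))  # [6, 9, 12, 15]
print(dilated_conv1d([1, 2, 3, 4, 5, 6], [1, 1, 1], rate=2))  # [9, 12]
```

Because pooling removes the spatial dimensions, levels with different resolutions produce vectors that can be joined directly, which is one plausible reason to pool before fusing.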

Paper Structure (25 sections, 13 equations, 7 figures, 5 tables)

Introduction
Proposed approach
Feature extraction
EAM
Dimension reduction block
Upper block
Middle block
Lower block
Feature fusion
Experiment and analysis
Datasets
Implementation
Results and discussion
Comparison with the SOTA methods
Ablation study of three components
...and 10 more sections

Figures (7)

Figure 1: High-level pipeline of the proposed approach. Note that $S_2$, $S_3$, $S_4$, and $S_5$ denote the chosen layers from the second, third, fourth, and fifth stages, respectively.
Figure 2: The pipeline of Enhanced VHR attention module (EAM).
Figure 3: Sample images for three different categories from the AID and NWPU datasets.
Figure 4: Training and validation curve produced while implementing our proposed approach on the AID (a) and NWPU (b) datasets.
Figure 5: Visualisation of discriminative regions produced by ResNet-50 and our proposed method (EAM), obtained with Grad-CAM (Selvaraju et al., 2017).
...and 2 more figures
