Efficient Dynamic Attention 3D Convolution for Hyperspectral Image Classification
Guandong Li, Mengxia Ye
TL;DR
The paper addresses hyperspectral image classification by tackling underutilization of joint spatial–spectral information and redundancy in high-dimensional spectra. It introduces Dynamic Attention Convolution (DAC), which uses K parallel kernels weighted by input-dependent attention to adaptively emphasize spatial structures and selectively discriminate spectral bands, without increasing network depth or width. DAC is integrated into an improved 3D-DenseNet (DACNet) with exponentially growing growth rates and full dense connectivity, yielding a lightweight yet expressive architecture that achieves state-of-the-art accuracy on Indian Pines, Pavia University, and Kennedy Space Center datasets while maintaining low computational cost. The work presents a practical, flexible approach to efficient hyperspectral feature extraction with broad applicability to CNN-based remote sensing pipelines and real-time classification scenarios.
Abstract
Deep neural networks face several challenges in hyperspectral image classification, including insufficient use of joint spatial-spectral information, vanishing gradients as depth increases, and overfitting. To extract features more efficiently while skipping redundant information, this paper proposes a dynamic attention convolution (DAC) design built on an improved 3D-DenseNet model. The design replaces a single convolutional kernel with multiple parallel kernels and assigns dynamic attention weights to them. In the spatial dimension of hyperspectral images, this attention mechanism adapts the feature response to spatial characteristics, focusing more on key spatial structures. In the spectral dimension, it discriminates dynamically between bands, alleviating the information redundancy and computational complexity caused by high spectral dimensionality. The DAC module enhances the model's representation capability through attention-based aggregation of multiple convolutional kernels, without increasing network depth or width. The proposed method demonstrates superior performance in both inference speed and accuracy, outperforming mainstream hyperspectral image classification methods on the Indian Pines (IN), Pavia University (UP), and Kennedy Space Center (KSC) datasets.
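
To make the kernel-aggregation idea concrete, the following is a minimal pure-Python sketch of a dynamic attention convolution on a single-channel 3D patch (bands x height x width). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the attention branch, so the sketch assumes global average pooling followed by a linear map and a softmax, as is common in the dynamic-convolution literature. All function names, the `proj` parameters, and the `temperature` argument are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a higher temperature flattens the weights."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(volume, proj, temperature=1.0):
    """Assumed attention branch: global average pool, linear map, softmax.

    proj is a list of K (weight, bias) pairs, one per parallel kernel.
    """
    flat = [v for plane in volume for row in plane for v in row]
    pooled = sum(flat) / len(flat)
    logits = [w * pooled + b for (w, b) in proj]
    return softmax(logits, temperature)

def aggregate_kernels(kernels, weights):
    """Attention-weighted sum of K same-shaped 3D kernels."""
    kd, kh, kw = len(kernels[0]), len(kernels[0][0]), len(kernels[0][0][0])
    return [[[sum(a * k[d][i][j] for a, k in zip(weights, kernels))
              for j in range(kw)]
             for i in range(kh)]
            for d in range(kd)]

def conv3d_valid(volume, kernel):
    """Plain 3D cross-correlation with no padding and stride 1."""
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(D - kd + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                acc = 0.0
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[d + a][i + b][j + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

def dynamic_conv3d(volume, kernels, proj, temperature=1.0):
    """Input-dependent convolution: weight the K kernels, merge them, convolve once."""
    weights = attention_weights(volume, proj, temperature)
    merged = aggregate_kernels(kernels, weights)
    return conv3d_valid(volume, merged), weights
```

Because convolution is linear in the kernel, merging the K kernels first and convolving once gives the same result as running K convolutions and mixing their outputs. This is one way to read the claim that representation capacity grows without added depth or width: in this sketch the output has the same shape as a single-kernel convolution, and only one convolution is executed per input.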
