Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance
Ryan Donghan Kwon, Gangjoo Robin Nam, Jisoo Tak, Junseob Shin, Hyerin Cha, Seung Won Lee
TL;DR
This work investigates ship classification in high-resolution optical satellite imagery by coupling a ResNet50 backbone with a Convolutional Block Attention Module (CBAM) and additional architectural enhancements. By combining transfer learning, multiscale feature integration, depthwise separable convolutions, and dilated convolutions, the proposed model raises accuracy from 0.87 (ResNet50 with standard CBAM) to 0.95 on four main ship classes, with notable gains in precision and recall across classes. Attention heatmaps show that the enhanced CBAM directs focus to ship features despite background complexity, supporting the interpretability and robustness of the approach. The study also discusses limitations such as data imbalance and computational cost, and points to future work on scalability and semi-supervised learning to support recognition of new or rare ship types in satellite imagery.
Abstract
In this study, we present a convolutional neural network (CNN) architecture for ship classification in optical satellite imagery that improves performance by integrating a convolutional block attention module (CBAM) with additional architectural changes. Building on the ResNet50 model, we first incorporated a standard CBAM to direct the model's focus toward more informative features, achieving an accuracy of 87% compared with 85% for the baseline ResNet50. We then added multiscale feature integration, depthwise separable convolutions, and dilated convolutions, yielding an enhanced ResNet model with an improved CBAM. This model reached an accuracy of 95%, with substantial improvements in precision, recall, and F1 score across the ship classes. In particular, the bulk carrier and oil tanker classes showed near-perfect precision and recall, underscoring the model's improved ability to identify and classify ships. Attention heatmap analyses further supported the efficacy of the improved model, revealing more focused attention on relevant ship features regardless of background complexity. These findings underscore the potential of integrating attention mechanisms and architectural innovations into CNNs for high-resolution satellite imagery classification. We also discuss the challenges of class imbalance and computational cost, and propose future directions for scalability and adaptability in recognizing new or rare ship types. This study lays the groundwork for applying advanced deep learning techniques in remote sensing, offering insights into scalable and efficient satellite image classification.
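To make the roles of depthwise separable and dilated convolutions concrete, the following is a minimal arithmetic sketch, not the authors' implementation. It compares the weight count of a standard convolution with that of a depthwise separable one, and computes the span covered by a dilated kernel. The channel counts, kernel size, and dilation rates are illustrative assumptions and are not taken from the paper; the function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel(k, dilation):
    """Span, in pixels, covered by a k-tap kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Assumed layer shape, chosen only for illustration.
    c_in, c_out, k = 256, 256, 3
    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:  {standard}")
    print(f"separable conv weights: {separable}")
    print(f"reduction factor:       {standard / separable:.2f}")
    for d in (1, 2, 4):
        print(f"3-tap kernel, dilation {d}: spans {effective_kernel(3, d)} pixels")
```

Under these assumed sizes, the separable layer needs roughly one ninth of the weights, while dilation widens the area a kernel covers without adding weights. This is one plausible reading of why the two techniques suit a model that must capture ships at several scales while keeping computational cost in check.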
