AMANet: Advancing SAR Ship Detection with Adaptive Multi-Hierarchical Attention Network
Xiaolin Ma, Junkai Cheng, Aihua Li, Yuhua Zhang, Zhilong Lin
TL;DR
This work tackles ship detection in SAR imagery, focusing on small and coastal vessels, where clutter and limited features hinder performance. It introduces AMANet, a detector built around the plug-and-play adaptive multi-hierarchical attention module (AMAM), which comprises a Multi-hierarchical Enhanced (ME) block for multi-scale feature fusion and an Adaptive Attention (AA) block for channel-wise, head-wise attention with learnable aggregation. Together, the ME and AA blocks provide robust multi-scale feature aggregation and diverse attention maps, improving detection accuracy on the SSDD and HRSID datasets and outperforming state-of-the-art methods in both inshore and offshore scenarios and across multiple YOLO backbones. These results demonstrate AMANet’s potential for practical SAR ship detection in cluttered coastal environments; future work will extend AMAM to Transformer-based backbones to further enhance performance.
Abstract
Recently, deep learning methods have been successfully applied to ship detection in synthetic aperture radar (SAR) images. Despite the development of numerous ship detection methods, detecting small and coastal ships remains a significant challenge because such targets offer limited features and coastal environments are cluttered. To address this, a novel adaptive multi-hierarchical attention module (AMAM) is proposed to learn multi-scale features and adaptively aggregate salient features from various feature layers, even in complex environments. Specifically, we first fuse information from adjacent feature layers to enhance the detection of smaller targets, thereby achieving multi-scale feature enhancement. Second, to filter out the adverse effects of complex backgrounds, we split the fused multi-level features along the channel dimension, extract the salient regions of each part individually, and adaptively aggregate the features originating from different channels. Third, we present a novel adaptive multi-hierarchical attention network (AMANet) by embedding the AMAM between the backbone network and the feature pyramid network (FPN). In addition, the AMAM can be readily inserted into different frameworks to improve object detection. Finally, extensive experiments on two large-scale SAR ship detection datasets demonstrate that AMANet is superior to state-of-the-art methods.
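The steps described in the abstract (fuse adjacent feature layers, split the fused features along the channel dimension, find salient regions per part, then aggregate adaptively) can be illustrated with a small dependency-free sketch. This is not the authors' implementation. The function names, the additive fusion, the sigmoid-of-channel-mean mask, and the softmax-weighted aggregation are all assumptions chosen for illustration; in a real network the aggregation logits would be learned and the feature maps would be tensors rather than nested lists.

```python
import math

# A feature map is a nested list indexed as [channel][row][col].
# All names and operator choices below are hypothetical, for illustration only.

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]
            rows.append(wide)
            rows.append(list(wide))
        out.append(rows)
    return out

def fuse_adjacent(shallow, deep):
    """Step 1 (assumed form): add the upsampled deeper map to the shallower one."""
    up = upsample2x(deep)
    return [[[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(c1, c2)]
            for c1, c2 in zip(shallow, up)]

def spatial_attention(group):
    """Step 2 (assumed form): a sigmoid of the channel mean acts as a pixel mask."""
    n, h, w = len(group), len(group[0]), len(group[0][0])
    mask = [[1.0 / (1.0 + math.exp(-sum(group[c][i][j] for c in range(n)) / n))
             for j in range(w)] for i in range(h)]
    return [[[group[c][i][j] * mask[i][j] for j in range(w)] for i in range(h)]
            for c in range(n)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def amam_sketch(shallow, deep, num_groups, logits):
    """Fuse, split into channel groups, attend per group, re-weight, concatenate."""
    fused = fuse_adjacent(shallow, deep)
    if len(fused) % num_groups != 0 or len(logits) != num_groups:
        raise ValueError("channels must divide evenly; one logit per group")
    size = len(fused) // num_groups
    groups = [fused[g * size:(g + 1) * size] for g in range(num_groups)]
    attended = [spatial_attention(g) for g in groups]
    weights = softmax(logits)  # stand-in for learnable aggregation weights
    out = []
    for weight, group in zip(weights, attended):
        for channel in group:
            out.append([[weight * v for v in row] for row in channel])
    return out  # same [C][H][W] shape as the shallow input

# Toy inputs: a 4-channel 4x4 shallow map and a 4-channel 2x2 deeper map.
shallow = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
deep = [[[0.5] * 2 for _ in range(2)] for _ in range(4)]
out = amam_sketch(shallow, deep, num_groups=2, logits=[0.0, 0.0])
print(len(out), len(out[0]), len(out[0][0]))
```

Because the output keeps the shape of the shallower input, a module of this kind could in principle sit between a backbone stage and the FPN without changing either side, which is the plug-in property the abstract describes.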
