Multitask Learning for SAR Ship Detection with Gaussian-Mask Joint Segmentation

Ming Zhao, Xin Zhang, André Kaup

TL;DR

MLDet, a multitask learning framework for object detection, detects ships in SAR images using an angle classification loss with aspect ratio weighting (ARW) and a dual-feature fusion attention (DFA) mechanism. DFA suppresses noisy background information and fuses shallow features with denoising features, which makes MLDet more robust to speckle noise.

Abstract

Detecting ships in synthetic aperture radar (SAR) images is challenging due to strong speckle noise, complex surroundings, and varying scales. This paper proposes MLDet, a multitask learning framework for SAR ship detection, consisting of object detection, speckle suppression, and target segmentation tasks. An angle classification loss with aspect ratio weighting is introduced to improve detection accuracy by addressing angular periodicity and object proportions. The speckle suppression task uses a dual-feature fusion attention mechanism to reduce noise and fuse shallow and denoising features, enhancing robustness. The target segmentation task, leveraging a rotated Gaussian-mask, aids the network in extracting target regions from cluttered backgrounds and improves detection efficiency with pixel-level predictions. The Gaussian-mask ensures ship centers have the highest probabilities, gradually decreasing outward under a Gaussian distribution. Additionally, a weighted rotated boxes fusion (WRBF) strategy combines multi-direction anchor predictions, filtering anchors beyond boundaries or with high overlap but low confidence. Extensive experiments on SSDD+ and HRSID datasets demonstrate the effectiveness and superiority of MLDet.
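The rotated Gaussian-mask described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rotated_gaussian_mask`, the `sigma_scale` parameter, and the choice to set each standard deviation proportional to the box side are assumptions made only for illustration. The sketch rotates each pixel offset into the box-aligned frame and evaluates an unnormalized 2D Gaussian, so the value is 1 at the ship center and decays outward, more slowly along the long side of the box.

```python
import math


def rotated_gaussian_mask(height, width, cx, cy, box_w, box_h, theta,
                          sigma_scale=0.25):
    """Build a height x width mask (list of rows) with values in [0, 1].

    (cx, cy) is the box center in pixels, box_w and box_h are the box side
    lengths, and theta is the rotation angle in radians. sigma_scale is a
    hypothetical knob: the standard deviation along each box axis is
    assumed to be sigma_scale times that side length.
    """
    sigma_u = max(box_w * sigma_scale, 1e-6)
    sigma_v = max(box_h * sigma_scale, 1e-6)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            dx, dy = x - cx, y - cy
            # Rotate the pixel offset into the box-aligned (u, v) frame.
            u = cos_t * dx + sin_t * dy
            v = -sin_t * dx + cos_t * dy
            # Unnormalized Gaussian: exactly 1.0 at the center.
            row.append(math.exp(-0.5 * ((u / sigma_u) ** 2
                                        + (v / sigma_v) ** 2)))
        mask.append(row)
    return mask


# Example: a 20 x 6 box rotated by 30 degrees, centered in a 33 x 33 patch.
mask = rotated_gaussian_mask(33, 33, cx=16, cy=16,
                             box_w=20, box_h=6, theta=math.pi / 6)
```

A mask of this kind could serve as a soft pixel-level target for the segmentation branch. How the paper actually sets the Gaussian spread, or thresholds the mask, is not stated in the abstract.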

Paper Structure

This paper contains 23 sections, 11 equations, 17 figures, 8 tables, and 1 algorithm.

Figures (17)

  • Figure 1: Examples of SAR images. (a) SAR images with complex backgrounds. (b) SAR images with multiscale ship targets.
  • Figure 2: The architecture of the proposed MLDet, including the object detection module, the denoised feature fusion module, and the target segmentation module.
  • Figure 3: The definition of the rotated rectangle $\theta$.
  • Figure 4: A demonstration of the limitations of the long-edge definition method for square-like boxes, which have a high IoU but a large training loss due to their angle mismatch. (a) Ground truth. (b) Prediction boxes.
  • Figure 5: Illustration of the dual-feature fusion attention mechanism.
  • ...and 12 more figures