Denoising-Enhanced YOLO for Robust SAR Ship Detection

Xiaojing Zhao, Shiyang Li, Zena Chu, Ying Zhang, Peinan Hao, Tianzi Yan, Jiajia Chen, Huicong Ning

TL;DR

CPN-YOLO is a high-precision SAR ship detection framework built upon YOLOv8 with three targeted improvements: a learnable large-kernel denoising module for input pre-processing, a feature extraction enhancement based on the PPA attention mechanism, and a Gaussian similarity loss derived from the normalized Wasserstein distance (NWD).

Abstract

With the rapid advancement of deep learning, synthetic aperture radar (SAR) imagery has become a key modality for ship detection. However, robust performance remains challenging in complex scenes, where clutter and speckle noise can induce false alarms and small targets are easily missed. To address these issues, we propose CPN-YOLO, a high-precision ship detection framework built upon YOLOv8 with three targeted improvements. First, we introduce a learnable large-kernel denoising module for input pre-processing, producing cleaner representations and more discriminative features across diverse ship types. Second, we design a feature extraction enhancement strategy based on the PPA attention mechanism to strengthen multi-scale modeling and improve sensitivity to small ships. Third, we incorporate a Gaussian similarity loss derived from the normalized Wasserstein distance (NWD) to better measure similarity under complex bounding-box distributions and improve generalization. Extensive experiments on HRSID and SSDD demonstrate the effectiveness of our method. On SSDD, CPN-YOLO surpasses the YOLOv8 baseline, achieving 97.0% precision, 95.1% recall, and 98.9% mAP, and consistently outperforms other representative deep-learning detectors in overall performance.
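
To make the third improvement more concrete, the following is a minimal sketch of a Gaussian similarity based on the normalized Wasserstein distance, as that measure is commonly formulated for small-object detection. It is not the authors' implementation. The function names `nwd` and `nwd_loss`, the `(cx, cy, w, h)` box format, and the normalizing constant `c` (with a placeholder default) are assumptions made for illustration. Each box is modeled as a 2-D Gaussian with mean `(cx, cy)` and standard deviations `(w/2, h/2)`, for which the squared 2-Wasserstein distance has a closed form.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes given as (cx, cy, w, h).

    Each box is treated as a Gaussian N((cx, cy), diag(w^2 / 4, h^2 / 4)).
    The constant c is a hypothetical normalizer, usually tied to the typical
    object size of the dataset; 12.8 is only a placeholder value here.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Closed-form squared 2-Wasserstein distance between the two Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    # Exponential normalization maps the distance to a similarity in (0, 1].
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Regression loss sketch: 1 - NWD, which is zero for identical boxes."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, this similarity stays nonzero and varies smoothly even when two small boxes do not overlap at all, which is one reason such a measure can suit small ships. How the term is weighted against the other YOLOv8 loss components is not specified in the abstract and is left out of this sketch.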

Paper Structure (18 sections, 16 equations, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Visualization of motivation. Panels (a) to (c) illustrate three challenges faced by existing methods. (a) Weak features, noise, and uneven illumination in low-light environments. (b) Down-sampling causes information loss for small targets, which are then easily drowned in the background. (c) The few pixels of small targets easily lead to false-negative samples and training imbalance.
  • Figure 2: Different types of detectors. Panels (a) to (c) show a two-stage detector, a one-stage detector, and an anchor-based detector, respectively.
  • Figure 3: The network architecture of CPN-YOLO. CBL denotes Conv, batch normalization, and SiLU. SPAF denotes SPPF followed by PPA. Within the framework, PPA and CID are the improvements we made.
  • Figure 4: Detailed architectures of the introduced task-specific blocks: (a) Channel-Independent Denoising (CID) module. (b) Original CID block.
  • Figure 5: Structure of the Parallelized Patch-Aware Attention Module. It is composed of two core components: a multi-branch fusion unit and an attention mechanism. The multi-branch fusion unit integrates two parallel operations: patch-aware convolution and concatenated convolution. In the patch-aware convolution, the hyperparameter p is set to 2 and 4, corresponding to the local and global receptive field branches, respectively.
  • ...and 4 more figures