Enhancing Maritime Object Detection in Real-Time with RT-DETR and Data Augmentation

Nader Nemati

TL;DR

This work tackles maritime object detection under small target sizes and limited labeled RGB data by adapting RT-DETR for real-time, end-to-end detection. It introduces multi-scale fusion, uncertainty-guided query initialization, and a domain-aware weighting scheme to balance synthetic and real samples, while ensuring evaluation is conducted solely on real imagery. A comprehensive component analysis demonstrates that each module contributes to improved performance, with fusion providing the largest gain and the full configuration achieving a mAP@0.5 of 0.89. Experiments on the TDSS-G1 dataset show robust real-time performance and better handling of minority classes through targeted augmentation, highlighting practical impact for coastal surveillance and naval safety scenarios.

Abstract

Maritime object detection is challenging because targets are small and labeled real RGB data is limited. This paper presents a real-time object detection system based on RT-DETR that is trained with augmented synthetic images and evaluated strictly on real data. The study adapts RT-DETR to the maritime environment by combining multi-scale feature fusion, uncertainty-minimizing query selection, and a weighting scheme between synthetic and real training samples. The fusion module in RT-DETR improves the detection of small, low-contrast vessels, query selection focuses on the most reliable proposals, and the weighting strategy helps reduce the visual gap between the synthetic and real domains. This design preserves DETR's end-to-end set prediction while allowing users to trade off speed and accuracy at inference time. Data augmentation is also used to balance the classes of the dataset, improving the robustness and accuracy of the model. The study delivers a complete Python pipeline for robust maritime detection that maintains real-time performance even under practical constraints. It also includes a component analysis that quantifies the contribution of each architectural module and explores their interactions, and it examines how the system handles failures in extreme lighting or sea conditions.
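The weighting between synthetic and real training samples can be illustrated with a small sketch. The abstract does not give the actual formula, so the function name `domain_weighted_loss`, the `synthetic_weight` parameter, and its default of 0.5 are all assumptions made for illustration: synthetic samples are simply down-weighted relative to real ones in a weighted mean of per-sample losses.

```python
def domain_weighted_loss(losses, is_synthetic, synthetic_weight=0.5):
    """Weighted mean of per-sample losses.

    Hypothetical illustration only: real samples get weight 1.0 and
    synthetic samples get `synthetic_weight`. The paper's actual
    weighting scheme is not specified in the abstract.
    """
    if len(losses) != len(is_synthetic):
        raise ValueError("losses and is_synthetic must have the same length")

    # One weight per sample, chosen by its domain flag.
    weights = [synthetic_weight if synthetic else 1.0 for synthetic in is_synthetic]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("total weight is zero; nothing to average")

    # Normalizing by the total weight keeps the loss scale comparable
    # across batches with different real/synthetic mixes.
    weighted_sum = sum(w * loss for w, loss in zip(weights, losses))
    return weighted_sum / total_weight


# One real sample (loss 1.0) and one synthetic sample (loss 3.0).
batch_loss = domain_weighted_loss([1.0, 3.0], [False, True], synthetic_weight=0.5)
```

With a weight below 1.0, a batch dominated by synthetic images contributes less to the gradient than its size alone would suggest, which is one simple way to keep training anchored to the real domain.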

Paper Structure

This paper contains 14 sections, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Detailed architecture of the RT-DETR model.
  • Figure 2: Overview of the RT-DETR maritime ship detection pipeline. From left to right: raw and synthetic data preparation → conversion and normalization of annotations (YOLO to COCO patching) → model training with RT-DETR → evaluation → inference and result visualization → final performance reporting.
  • Figure 3: Examples from TDSS-G1: real vs synthetic transformations.
  • Figure 4: Augmentation examples used for minority classes.
  • Figure 5: Representative detection outcomes on real maritime images.
  • ...and 3 more figures
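The annotation conversion step named in Figure 2 (YOLO to COCO) can be sketched as follows. This is a minimal example that assumes the standard conventions of the two formats: YOLO boxes are stored as normalized center coordinates plus width and height, and COCO boxes as `[x_min, y_min, width, height]` in pixels. The function name `yolo_to_coco_bbox` is hypothetical, and any category-id remapping the paper's pipeline performs is not covered.

```python
def yolo_to_coco_bbox(x_center, y_center, width, height, img_w, img_h):
    """Convert one normalized YOLO box to a COCO pixel box.

    YOLO:  (x_center, y_center, width, height), each in [0, 1]
    COCO:  [x_min, y_min, width, height], in pixels
    """
    # Scale the normalized size to pixels.
    box_w = width * img_w
    box_h = height * img_h

    # Move from the box center to its top-left corner.
    x_min = x_center * img_w - box_w / 2.0
    y_min = y_center * img_h - box_h / 2.0
    return [x_min, y_min, box_w, box_h]


# A box centered in a 100x200 image, covering 20% of its width and 40% of its height.
coco_box = yolo_to_coco_bbox(0.5, 0.5, 0.2, 0.4, img_w=100, img_h=200)
```

The conversion needs the image dimensions, which YOLO label files do not store, so a real converter would have to read each image (or a size index) alongside its label file.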