Improving the Detection of Small Oriented Objects in Aerial Images

Chandler Timm C. Doloriel, Rhandley D. Cajote

TL;DR

This work tackles the problem of detecting very small oriented objects in aerial images, where traditional axis-aligned detectors struggle to localize objects precisely. It introduces the Attention-Points Network, a two-stage detector that incorporates efficient self-attention and two novel losses: Guided-Attention Loss (GALoss) to align attention features with coarse object masks, and Box-Points Loss (BPLoss) to score box-points relative to the target oriented bounding box using a differentiable sigmoid kernel. The approach yields improvements over baselines on DOTA-v1.5 and HRSC2016, with notable gains for the smallest instances and across higher IoU thresholds, as demonstrated by ablation studies. The method provides a practical advancement for small, oriented-object detection in aerial imagery and comes with publicly available code.

Abstract

Small oriented objects that occupy only a tiny pixel area in large-scale aerial images are difficult to detect because of their size and orientation. Existing oriented aerial detectors have shown promising results but focus mainly on orientation modeling, with less regard for the size of the objects. In this work, we propose a method to accurately detect small oriented objects in aerial images by enhancing the classification and regression tasks of the oriented object detection model. We design the Attention-Points Network, which incorporates two losses: Guided-Attention Loss (GALoss) and Box-Points Loss (BPLoss). GALoss uses an instance segmentation mask as ground truth to learn the attention features needed to improve the detection of small objects. These attention features are then used to predict box points for BPLoss, which determines each point's position relative to the target oriented bounding box. Experimental results show the effectiveness of our Attention-Points Network on a standard oriented aerial dataset with small object instances (DOTA-v1.5) and on a maritime-related dataset (HRSC2016). The code is publicly available.
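
To make the GALoss idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes plain scaled dot-product self-attention (the paper's efficient attention variant is not specified here) and a binary cross-entropy comparison between the attention output and a coarse object mask. The names `self_attention`, `guided_attention_loss`, and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along one axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feat, w_q, w_k, w_v):
    """Scaled dot-product self-attention over flattened RoI positions.

    feat: (N, C) array, one row per spatial position of the RoI.
    w_q, w_k: (C, D) projections; w_v: (C, D_out) projection.
    Returns an (N, D_out) array of attention features.
    """
    q, k, v = feat @ w_q, feat @ w_k, feat @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)
    return weights @ v

def guided_attention_loss(attn_logits, mask, eps=1e-7):
    """Assumed form of the loss: binary cross-entropy between
    sigmoid(attention features) and a binary object mask."""
    prob = 1.0 / (1.0 + np.exp(-attn_logits))
    prob = np.clip(prob, eps, 1.0 - eps)
    bce = mask * np.log(prob) + (1.0 - mask) * np.log(1.0 - prob)
    return float(-np.mean(bce))

# Toy usage: a 7x7 RoI with 16 channels, flattened to 49 positions.
rng = np.random.default_rng(0)
feat = rng.normal(size=(49, 16))
w_q = rng.normal(size=(16, 8))
w_k = rng.normal(size=(16, 8))
w_v = rng.normal(size=(16, 1))  # one attention logit per position
mask = (rng.random(49) > 0.5).astype(float)
loss = guided_attention_loss(self_attention(feat, w_q, w_k, w_v)[:, 0], mask)
```

In this sketch the loss is small when the attention features are strongly positive inside the mask and strongly negative outside it, which is the alignment the abstract describes.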

Paper Structure

This paper contains 19 sections, 7 equations, 6 figures, 4 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Architecture of Attention-Points Network.
  • Figure 2: Illustration of Guided-Attention Loss. The input RoI is transformed into three vectors, Queries (Q), Keys (K), and Values (V), which are processed by a self-attention network to obtain attention features (x). These features are compared with object masks using Guided-Attention Loss.
  • Figure 3: General idea of Box-Points Loss. $\{T_1, T_2, T_3, T_4\}$ are the triangles formed when the edges of the OBB are connected with the box-point at $(i,j)$.
  • Figure 4: Visualization of detection results on DOTA-v1.5 dataset.
  • Figure 5: Visualization of detection results on HRSC2016 dataset. Ships are either in the sea or inshore.
  • ...and 1 more figures
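
The triangle construction in the Figure 3 caption can be illustrated with a short NumPy sketch. It relies on a standard geometric fact: for a point inside a convex quadrilateral, the areas of the four triangles formed with the edges sum to the box area, while for a point outside they sum to more. The soft score below, `2 * sigmoid(-k * excess)`, is an assumed kernel chosen for illustration and is not taken from the paper. The helper names `obb_corners` and `box_point_score` and the steepness `k` are hypothetical.

```python
import numpy as np

def triangle_area(p, a, b):
    """Unsigned area of triangle (p, a, b) via the 2D cross product."""
    cross = (a[0] - p[0]) * (b[1] - p[1]) - (a[1] - p[1]) * (b[0] - p[0])
    return 0.5 * abs(cross)

def obb_corners(cx, cy, w, h, theta):
    """Four corners of an oriented box, in order around the box."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    half = np.array([[-w / 2, -h / 2], [w / 2, -h / 2],
                     [w / 2, h / 2], [-w / 2, h / 2]])
    return half @ rot.T + np.array([cx, cy])

def box_point_score(point, corners, k=10.0):
    """Soft inside/outside score in (0, 1] for a box-point.

    Sums the areas of T_1..T_4 (point joined to each OBB edge) and
    compares the sum with the OBB area. The score is about 1 inside
    the box and decays toward 0 as the point moves away from it.
    """
    p = np.asarray(point, dtype=float)
    tri_sum = sum(triangle_area(p, corners[n], corners[(n + 1) % 4])
                  for n in range(4))
    # A rectangle's area is twice the triangle on three consecutive corners.
    box_area = 2.0 * triangle_area(corners[0], corners[1], corners[2])
    excess = (tri_sum - box_area) / box_area  # ~0 inside, > 0 outside
    return 2.0 / (1.0 + np.exp(k * excess))   # equals 2 * sigmoid(-k * excess)

corners = obb_corners(0.0, 0.0, 4.0, 2.0, 0.3)
inside = box_point_score((0.0, 0.0), corners)
outside = box_point_score((10.0, 10.0), corners)
```

Because the score is a smooth function of the point coordinates almost everywhere, a quantity of this kind can serve as a differentiable stand-in for a hard inside/outside test.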