NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection

Yechan Kim, SooYeon Kim, Moongu Jeon

TL;DR

This work addresses annotation noise and limited data in remote sensing object detection by introducing NBBOX, a bounding-box level augmentation that injects noise through scaling, rotation, and translation of oriented boxes $(x_c, y_c, w, h, \theta)$. It provides both a straightforward implementation and optional scale-aware mechanisms that skip tiny boxes via a threshold $\gamma$, improving robustness while remaining computationally efficient. Through experiments on DOTA and DIOR-R with rotated Faster R-CNN and FCOS, NBBOX demonstrates consistent gains over image-level augmentations and offers significant training-time efficiency, proving especially suitable for aerial imagery where bounding box annotations may be imperfect. The method is lightweight, integrable with common detection frameworks, and supported by open-source code to facilitate adoption and replication across remote sensing workloads.
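The box-level noise described above can be illustrated with a minimal sketch. It is not the authors' released implementation: the function name `noisy_box`, the uniform sampling, the default ranges, and the rule of comparing the shorter side against $\gamma$ are all assumptions made for illustration. The image itself is left untouched; only the oriented box label $(x_c, y_c, w, h, \theta)$ is perturbed.

```python
import math
import random
from typing import NamedTuple, Optional, Tuple


class OBox(NamedTuple):
    """Oriented box (x_c, y_c, w, h, theta), with theta in radians."""
    xc: float
    yc: float
    w: float
    h: float
    theta: float


def noisy_box(
    box: OBox,
    scale: Tuple[float, float] = (0.99, 1.01),    # hypothetical scaling range
    rotate_deg: Tuple[float, float] = (-1.0, 1.0),  # hypothetical rotation range (degrees)
    shift_px: Tuple[float, float] = (-1.0, 1.0),    # hypothetical translation range (pixels)
    gamma: Optional[float] = None,                  # size threshold; None disables the check
    rng: Optional[random.Random] = None,
) -> OBox:
    """Return a copy of `box` with small scaling, rotation, and translation noise."""
    rng = rng or random.Random()
    # Scale-aware variant (assumed rule): leave tiny boxes unchanged.
    if gamma is not None and min(box.w, box.h) < gamma:
        return box
    s = rng.uniform(*scale)                      # one factor for both sides keeps the aspect ratio
    dtheta = math.radians(rng.uniform(*rotate_deg))
    dx = rng.uniform(*shift_px)
    dy = rng.uniform(*shift_px)
    return OBox(box.xc + dx, box.yc + dy, box.w * s, box.h * s, box.theta + dtheta)


# Usage: perturb every box label of one training sample.
labels = [OBox(120.0, 80.0, 40.0, 16.0, 0.3), OBox(10.0, 10.0, 4.0, 3.0, 0.0)]
rng = random.Random(0)
augmented = [noisy_box(b, gamma=8.0, rng=rng) for b in labels]
```

Because the transform touches only five numbers per box, its cost is negligible next to image-level augmentations, which is consistent with the training-time efficiency claimed above.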

Abstract

Data augmentation has advanced significantly in computer vision over the years as a way to improve model performance, particularly in scenarios with limited or insufficient data. Currently, most studies focus on adjusting the image or its features to expand the size, quality, and variety of samples during training in various tasks, including object detection. However, we argue that it is necessary to investigate bounding box transformations as a data augmentation technique rather than image-level transformations, especially in aerial imagery, where bounding box annotations may be inconsistent. Hence, this letter presents a thorough investigation of bounding box transformation in terms of scaling, rotation, and translation for remote sensing object detection. We call this augmentation strategy NBBOX (Noise Injection into Bounding Box). We conduct extensive experiments on DOTA and DIOR-R, both well-known datasets that include a variety of rotated generic objects in aerial images. Experimental results show that our approach significantly improves remote sensing object detection without bells and whistles, and that it is more time-efficient than other state-of-the-art augmentation strategies.

Paper Structure (19 sections, 4 equations, 3 figures, 7 tables, 1 algorithm)

Figures (3)

  • Figure 1: Examples of the proposed data augmentation method named NBBOX for remote sensing object detection.
  • Figure 2: Comparison between provided bounding box labels and minimum enclosing rectangles for objects on DOTA and DIOR-R.
  • Figure 3: Impact of scaling, rotation, and translation of bounding boxes on model performance: as we observe similar outcomes across different architectures and datasets, for brevity, we include only the result from Faster R-CNN with ResNet-50 trained on DIOR-R.