Robust Tiny Object Detection in Aerial Images amidst Label Noise

Haoran Zhu, Chang Xu, Wen Yang, Ruixiang Zhang, Yan Zhang, Gui-Song Xia

TL;DR

The paper tackles robust tiny object detection in aerial imagery under label noise, identifying class shifts and inaccurate bounding boxes as the main degradation modes. It introduces DN-TOD, a plug-and-play framework with a Class-aware Label Correction (CLC) module and a Trend-guided Learning Strategy (TLS) consisting of Trend-guided Label Reweighting (TLR) and Recurrent Box Regeneration (RBR). Through label-noise characterization, synthetic data generation, and extensive experiments on synthetic and real-world datasets, DN-TOD delivers substantial robustness gains for both one-stage and two-stage detectors, including a 4.9-point AP improvement under 40% mixed noise on RFLA baselines. The approach advances practical remote sensing interpretation by enabling reliable tiny-object detection under imperfect annotations and is designed for easy integration into existing detection pipelines.

Abstract

Precise detection of tiny objects in remote sensing imagery remains a significant challenge due to their limited visual information and frequent occurrence within scenes. This challenge is further exacerbated by the practical burden and inherent errors associated with manual annotation: annotating tiny objects is laborious and prone to errors (i.e., label noise). Training detectors for such objects using noisy labels often leads to suboptimal performance, with networks tending to overfit on noisy labels. In this study, we address the intricate issue of tiny object detection under noisy label supervision. We systematically investigate the impact of various types of noise on network training, revealing the vulnerability of object detectors to class shifts and inaccurate bounding boxes for tiny objects. To mitigate these challenges, we propose a DeNoising Tiny Object Detector (DN-TOD), which incorporates a Class-aware Label Correction (CLC) scheme to address class shifts and a Trend-guided Learning Strategy (TLS) to handle bounding box noise. CLC mitigates inaccurate class supervision by identifying and filtering out class-shifted positive samples, while TLS reduces noisy box-induced erroneous supervision through sample reweighting and bounding box regeneration. Additionally, our method can be seamlessly integrated into both one-stage and two-stage object detection pipelines. Comprehensive experiments conducted on synthetic (i.e., noisy AI-TOD-v2.0 and DOTA-v2.0) and real-world (i.e., AI-TOD) noisy datasets demonstrate the robustness of DN-TOD under various types of label noise. Notably, when applied to the strong baseline RFLA, DN-TOD exhibits a noteworthy performance improvement of 4.9 points under 40% mixed noise. Datasets, code, and models will be made publicly available.

Paper Structure

This paper contains 22 sections, 11 equations, 9 figures, 6 tables.

Figures (9)

  • Figure 1: An overview of our method. Observing that the model is quite sensitive to class shifts and inaccurate bounding boxes when training tiny object detectors, we propose to tackle them with Class-aware Label Correction and Trend-guided Learning, respectively. Orange points denote positive samples.
  • Figure 2: (a) compares noisy box labels with predicted boxes. The orange box is an inaccurate bounding box annotation; the blue box is the prediction of the regression branch. The network can still provide relatively accurate regression predictions even under noisy box supervision. (b) shows how confidence changes across epochs. Clean samples exhibit an upward learning trend, while noisy samples show a downward or constant trend during training.
  • Figure 3: Mean average precision mAP@[.5,.95] of FCOS with RFLA on the synthesized “noisy” AI-TOD-v2.0 dataset, where the annotations are randomly perturbed. As the noise level increases, i.e., as annotations become more and more inaccurate, the mAP under class shifts and inaccurate bounding boxes drops significantly, while the mAP under missing labels and extra labels remains high.
  • Figure 4: 30% simulated “noisy” AI-TOD-v2.0 dataset (green for original clean labels, red for missing labels, blue for extra labels, yellow for inaccurate bounding boxes, and dark cyan for class shifts).
  • Figure 5: The workflow of Class-aware Label Correction (CLC). CLC consists of two processes: class confusion state updating and noisy sample filtering. The DCM is updated by Eq. \ref{con:update} each time a new image arrives. For each positive sample, we compare the predicted values of all classes $P$ with the DCM and determine whether this sample is noisy based on Eq. \ref{con:Gamma}. If any class prediction $p_i$ simultaneously satisfies the three conditions, the positive sample is considered noisy.
  • ...and 4 more figures
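
The four noise types named in Figures 3 and 4 (missing labels, extra labels, inaccurate bounding boxes, class shifts) can be illustrated with a small perturbation routine. This is only a minimal sketch: the paper's actual synthesis protocol is not given in this excerpt, so the `Box` type, the function `perturb_annotations`, the `jitter` parameter, and the choice to apply each noise type independently with the same probability are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    # Hypothetical annotation record: top-left corner, size, class index.
    x: float
    y: float
    w: float
    h: float
    label: int


def perturb_annotations(boxes, noise_rate, num_classes, img_w, img_h, rng, jitter=0.5):
    """Return a noisy copy of `boxes` (assumed protocol, not the paper's).

    Each noise type fires independently with probability `noise_rate`.
    `jitter` bounds the relative shift and rescaling of inaccurate boxes.
    """
    noisy = []
    for b in boxes:
        # Missing label: the annotation is dropped entirely.
        if rng.random() < noise_rate:
            continue
        label = b.label
        # Class shift: the label is replaced by a different class.
        if num_classes > 1 and rng.random() < noise_rate:
            label = rng.choice([c for c in range(num_classes) if c != b.label])
        x, y, w, h = b.x, b.y, b.w, b.h
        # Inaccurate bounding box: shift and rescale relative to the box size.
        if rng.random() < noise_rate:
            x += rng.uniform(-jitter, jitter) * b.w
            y += rng.uniform(-jitter, jitter) * b.h
            w *= 1 + rng.uniform(-jitter, jitter)
            h *= 1 + rng.uniform(-jitter, jitter)
        noisy.append(Box(x, y, w, h, label))
    # Extra labels: spurious boxes with a random class at a random location,
    # reusing the size of an existing box so they stay "tiny".
    for b in boxes:
        if rng.random() < noise_rate:
            x = rng.uniform(0, max(img_w - b.w, 0))
            y = rng.uniform(0, max(img_h - b.h, 0))
            noisy.append(Box(x, y, b.w, b.h, rng.randrange(num_classes)))
    return noisy


clean = [Box(10, 10, 8, 8, 0), Box(50, 40, 6, 12, 1), Box(100, 90, 10, 5, 2)]
noisy = perturb_annotations(clean, noise_rate=0.3, num_classes=3,
                            img_w=800, img_h=800, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the perturbation reproducible, which matters when comparing detectors trained at the same noise level.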