Robust Tiny Object Detection in Aerial Images amidst Label Noise
Haoran Zhu, Chang Xu, Wen Yang, Ruixiang Zhang, Yan Zhang, Gui-Song Xia
TL;DR
The paper tackles robust tiny object detection in aerial imagery under label noise, identifying class shifts and inaccurate bounding boxes as the main degradation modes. It introduces DN-TOD, a plug-and-play framework with a Class-aware Label Correction (CLC) module and a Trend-guided Learning Strategy (TLS) consisting of Trend-guided Label Reweighting (TLR) and Recurrent Box Regeneration (RBR). Through label-noise characterization, synthetic data generation, and extensive experiments on synthetic and real-world datasets, DN-TOD delivers substantial robustness gains for both one-stage and two-stage detectors, including a 4.9-point AP improvement over the RFLA baseline under 40% mixed noise. The approach advances practical remote sensing interpretation by enabling reliable tiny-object detection under imperfect annotations and is designed for easy integration into existing detection pipelines.
Abstract
Precise detection of tiny objects in remote sensing imagery remains a significant challenge due to their limited visual information and frequent occurrence within scenes. This challenge is further exacerbated by the practical burden of manual annotation: annotating tiny objects is laborious and prone to errors (i.e., label noise). Training detectors for such objects using noisy labels often leads to suboptimal performance, with networks tending to overfit to noisy labels. In this study, we address the intricate issue of tiny object detection under noisy label supervision. We systematically investigate the impact of various types of noise on network training, revealing the vulnerability of object detectors to class shifts and inaccurate bounding boxes for tiny objects. To mitigate these challenges, we propose a DeNoising Tiny Object Detector (DN-TOD), which incorporates a Class-aware Label Correction (CLC) scheme to address class shifts and a Trend-guided Learning Strategy (TLS) to handle bounding box noise. CLC mitigates inaccurate class supervision by identifying and filtering out class-shifted positive samples, while TLS reduces noisy box-induced erroneous supervision through sample reweighting and bounding box regeneration. Additionally, our method can be seamlessly integrated into both one-stage and two-stage object detection pipelines. Comprehensive experiments conducted on synthetic (i.e., noisy AI-TOD-v2.0 and DOTA-v2.0) and real-world (i.e., AI-TOD) noisy datasets demonstrate the robustness of DN-TOD under various types of label noise. Notably, when applied to the strong baseline RFLA, DN-TOD improves performance by 4.9 points under 40% mixed noise. Datasets, code, and models will be made publicly available.
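
To make the two noise types concrete, the following is a minimal sketch of how class shifts and inaccurate bounding boxes might be injected into clean annotations. It is not the protocol used to build the noisy AI-TOD-v2.0 or DOTA-v2.0 sets, which the abstract does not specify. The function name `inject_label_noise`, the `(x1, y1, x2, y2)` box format, the uniform class flip, and the choice to shift each box edge by up to `noise_rate` times the box size are all assumptions made for illustration, as is reading "mixed noise" as class noise and box noise applied together.

```python
import random


def inject_label_noise(boxes, labels, num_classes, noise_rate, seed=0):
    """Return noisy copies of (x1, y1, x2, y2) boxes and integer class labels.

    Hypothetical noise model (an assumption, not the paper's protocol):
    - class shift: with probability `noise_rate`, swap the label for a
      different class chosen uniformly at random (requires num_classes >= 2);
    - box noise: move each box edge by a uniform offset of at most
      `noise_rate` times the box width or height.
    """
    rng = random.Random(seed)
    noisy_boxes, noisy_labels = [], []
    for (x1, y1, x2, y2), label in zip(boxes, labels):
        # Class shift: replace the label with some other class.
        if rng.random() < noise_rate:
            label = rng.choice([c for c in range(num_classes) if c != label])

        # Box noise: perturb every edge relative to the box size, so tiny
        # objects receive proportionally tiny (but still harmful) shifts.
        w, h = x2 - x1, y2 - y1
        nx1 = x1 + rng.uniform(-noise_rate, noise_rate) * w
        ny1 = y1 + rng.uniform(-noise_rate, noise_rate) * h
        nx2 = x2 + rng.uniform(-noise_rate, noise_rate) * w
        ny2 = y2 + rng.uniform(-noise_rate, noise_rate) * h

        # Re-order the corners so the box stays valid even for large noise.
        noisy_boxes.append(
            (min(nx1, nx2), min(ny1, ny2), max(nx1, nx2), max(ny1, ny2))
        )
        noisy_labels.append(label)
    return noisy_boxes, noisy_labels


# Example: two tiny objects corrupted at a 40% noise level.
clean_boxes = [(10, 10, 18, 16), (40, 42, 46, 50)]
clean_labels = [0, 3]
noisy_boxes, noisy_labels = inject_label_noise(
    clean_boxes, clean_labels, num_classes=8, noise_rate=0.4
)
```

Scaling the offsets by box size reflects why tiny objects are vulnerable: a shift of a few pixels on an 8-pixel-wide box can remove most of its overlap with the true object, whereas the same shift on a large box would barely matter.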
