Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning
Chang Xu, Ruixiang Zhang, Wen Yang, Haoran Zhu, Fang Xu, Jian Ding, Gui-Song Xia
TL;DR
This work tackles oriented tiny object detection, a setting with extremely small objects (mean object size of $10.6^{2}$ pixels) and arbitrary orientations, by introducing AI-TOD-R, a challenging dataset, a corresponding benchmark, and a Dynamic Coarse-to-Fine Learning (DCFL) pipeline. DCFL combines a dynamic Prior Capturing Block, which adapts priors to object extents, with a two-stage sampling regime (coarse positive sampling across scales, followed by finer posterior matching using a Dynamic Gaussian Mixture Model, DGMM) to overcome bias against tiny objects. Across eight heterogeneous benchmarks, DCFL delivers state-of-the-art accuracy without additional inference cost, demonstrating its versatility across one-stage and two-stage detectors and its effectiveness under fully-supervised and label-efficient settings; notable gains include improvements in $AP_{0.5}$ and robustness to extreme object sizes. The practical impact lies in enabling reliable detection of densely packed, orientation-variant tiny objects in aerial and remote sensing imagery, with open-source code facilitating adoption and further research; future work may extend to open-world settings, multi-modality, and foundation-model integration. $AP$- and $IoU$-based metrics are used throughout, and the key ideas are expressed through the DGMM, GJSD, and dynamic priors that adapt during training.
Abstract
Detecting oriented tiny objects, which are limited in appearance information yet prevalent in real-world applications, remains an intricate and under-explored problem. To address this, we systematically introduce a new dataset, a benchmark, and a dynamic coarse-to-fine learning scheme in this study. Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets. Based on AI-TOD-R, we present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches. Through investigation, we identify a learning bias that is present across various learning pipelines: confident objects become increasingly confident, while vulnerable oriented tiny objects are further marginalized, hindering their detection performance. To mitigate this issue, we propose a Dynamic Coarse-to-Fine Learning (DCFL) scheme to achieve unbiased learning. DCFL dynamically updates prior positions to better align with the limited areas of oriented tiny objects, and it assigns samples in a way that balances both quantity and quality across different object shapes, thus mitigating biases in prior settings and sample selection. Extensive experiments across eight challenging object detection datasets demonstrate that DCFL achieves state-of-the-art accuracy, high efficiency, and remarkable versatility. The dataset, benchmark, and code are available at https://chasel-tsui.github.io/AI-TOD-R/.
