Category-Aware Dynamic Label Assignment with High-Quality Oriented Proposal

Mingkui Feng, Hancheng Yu, Xiaoyu Dang, Ming Zhou

TL;DR

This work tackles angle regression challenges in oriented object detection for aerial imagery by introducing a complex-plane representation of angles and a differentiable trigonometric loss, effectively eliminating the angle boundary problem. It couples this loss with a Conformer RPN Head that adapts receptive fields to better learn angle information, and introduces a category-aware dynamic label assignment (CDLA) that aligns classification and regression through predicted category feedback. The combined approach achieves state-of-the-art or competitive results on DOTA-v1.0, DOTA-v1.5, DIOR-R, and HRSC2016 with minimal tuning. Overall, the proposed trigonometric loss function (TLF), Conformer RPN Head, and CDLA form an efficient framework for high-quality oriented proposals in remote sensing object detection.

Abstract

Objects in aerial images are typically embedded in complex backgrounds and exhibit arbitrary orientations. When employing oriented bounding boxes (OBB) to represent arbitrary oriented objects, the periodicity of angles could lead to discontinuities in label regression values at the boundaries, inducing abrupt fluctuations in the loss function. To address this problem, an OBB representation based on the complex plane is introduced in the oriented detection framework, and a trigonometric loss function is proposed. Moreover, leveraging prior knowledge of complex background environments and significant differences in large objects in aerial images, a conformer RPN head is constructed to predict angle information. The proposed loss function and conformer RPN head jointly generate high-quality oriented proposals. A category-aware dynamic label assignment based on predicted category feedback is proposed to address the limitations of solely relying on IoU for proposal label assignment. This method makes negative sample selection more representative, ensuring consistency between classification and regression features. Experiments were conducted on four realistic oriented detection datasets, and the results demonstrate superior performance in oriented object detection with minimal parameter tuning and time costs. Specifically, mean average precision (mAP) scores of 82.02%, 71.99%, 69.87%, and 98.77% were achieved on the DOTA-v1.0, DOTA-v1.5, DIOR-R, and HRSC2016 datasets, respectively.
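
The boundary discontinuity described above can be illustrated with a small numerical sketch. This is not the paper's trigonometric loss, whose exact form is not given in this section; it is a generic example that assumes an angle period of $\pi$ and maps each angle to a point on the unit circle in the complex plane. The names `direct_angle_loss` and `complex_plane_loss` are hypothetical.

```python
import math


def smooth_l1(x, beta=1.0):
    """Standard smooth L1 penalty on a scalar residual."""
    x = abs(x)
    return 0.5 * x * x / beta if x < beta else x - 0.5 * beta


def direct_angle_loss(theta_p, theta_g):
    """Regress the raw angle: discontinuous across the period boundary."""
    return smooth_l1(theta_p - theta_g)


def complex_plane_loss(theta_p, theta_g, period=math.pi):
    """Illustrative loss on the unit circle (assumed form, not the paper's TLF).

    Each angle is mapped to exp(i * k * theta) with k = 2*pi / period, so two
    angles that differ by one period land on the same point. The value equals
    1 - cos(k * (theta_p - theta_g)), which is smooth and periodic.
    """
    k = 2.0 * math.pi / period
    dx = math.cos(k * theta_p) - math.cos(k * theta_g)
    dy = math.sin(k * theta_p) - math.sin(k * theta_g)
    return 0.5 * (dx * dx + dy * dy)


# Target at -pi/2; the prediction sits just on the other side of the boundary,
# so the two boxes have almost the same orientation.
theta_g = -math.pi / 2
theta_p = math.pi / 2 - 0.01

raw = direct_angle_loss(theta_p, theta_g)       # large, despite near-identical boxes
wrapped = complex_plane_loss(theta_p, theta_g)  # close to zero
print(f"direct: {raw:.4f}  complex-plane: {wrapped:.6f}")
```

In this sketch the direct regression penalizes the prediction as if it were almost $\pi$ radians off, while the unit-circle form treats it as a 0.01 rad error, which is the behavior the abstract attributes to a complex-plane representation.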

Paper Structure

This paper contains 26 sections, 7 equations, 10 figures, 7 tables, 1 algorithm.

Figures (10)

  • Figure 1: Illustration of the motivation behind the design of the loss function. Arrows indicate direction, $\theta_{p}$ is the predicted angle, and $\theta_{g}$ is the target angle; here $\theta_{g} = -\frac{\pi}{2}$ is used as an example. (a) Representation and regression of the object OBB angle in the image. (b) The loss curve of smooth $L_{1}$ with respect to the predicted angle. (c) The loss curve of $1 - \text{IoU}$ with respect to the predicted angle. (d) The relationship between predicted values and our loss function. (e) Representation and regression of the object angle in the complex plane. (f) The loss curve of our loss function with respect to the predicted angle.
  • Figure 2: Overall architecture of the proposed approach. The yellow dashed arrows indicate steps that occur only during training, while the blue solid arrows indicate steps shared between training and testing.
  • Figure 3: The horizontal axis shows the angular error between the prediction and the label. The width-to-height ratios in panels (a), (b), and (c) are 2, 3, and 5, respectively.
  • Figure 4: (a) Conformer RPN Head. (b) Multi-Head Self-Attention.
  • Figure 5: Illustration of the problem of label assignment for negative samples in oriented object detection. The scores in the yellow boxes represent the predicted probabilities for the current object classes and the background class in multi-object classification, denoted as ${\mathcal{P}_{c}}_{(\textbf{TP})}$ and ${\mathcal{P}_{c}}_{(\textbf{BK})}$ respectively. The values in the gray boxes indicate the IoU between proposal boxes and ground truth. In the maximum IoU label assignment, green, blue, and pink respectively denote ground truth, positive samples, and negative samples. In our label assignment, negative samples are divided into ignored (white), normal (pink), and focused (red) negative samples based on feedback on predicted classification values.
  • ...and 5 more figures
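
For context on the baseline that Figure 5 contrasts against, the following is a minimal sketch of conventional maximum IoU label assignment. It is not the paper's category-aware method. The function name `max_iou_assign` and the thresholds of 0.5 are assumptions chosen for illustration.

```python
def max_iou_assign(iou_matrix, pos_thr=0.5, neg_thr=0.5):
    """Assign each proposal a label from IoU alone.

    iou_matrix[i][j] is the IoU between proposal i and ground-truth box j.
    Returns one entry per proposal: the matched ground-truth index for a
    positive, -1 for a negative, or None when the best IoU falls between
    the two thresholds and the proposal is left out of training.
    """
    labels = []
    for row in iou_matrix:
        if not row:  # image without ground truth: everything is background
            labels.append(-1)
            continue
        best_j = max(range(len(row)), key=row.__getitem__)
        best_iou = row[best_j]
        if best_iou >= pos_thr:
            labels.append(best_j)
        elif best_iou < neg_thr:
            labels.append(-1)
        else:
            labels.append(None)
    return labels


# Three proposals, two ground-truth boxes (made-up IoU values).
ious = [
    [0.72, 0.10],  # overlaps ground truth 0 well
    [0.45, 0.05],  # near miss
    [0.02, 0.00],  # plain background
]
print(max_iou_assign(ious))
```

In this example the near miss and the plain background proposal receive the same negative label, and the classifier's predicted scores are never consulted. That is the limitation of relying solely on IoU that the caption's split into ignored, normal, and focused negatives is meant to address.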