ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection

Ying Zeng, Yushi Chen, Xue Yang, Qingyun Li, Junchi Yan

TL;DR

This work tackles high-precision oriented object detection in aerial imagery by showing that AP$_{50}$ is insufficient for evaluating orientation accuracy. It proposes ARS-DETR, a DETR-based framework that integrates Aspect Ratio Aware Circle Smooth Label (AR-CSL), a Rotated Deformable Attention (RDA) module, and aspect-ratio–sensitive matching and loss (ARM/ARL), together with denoising training. Empirical results on DOTA-v1.0, DIOR-R, and OHD-SJTU demonstrate competitive AP$_{50}$ and, more importantly, consistently superior AP$_{75}$, underscoring the approach's effectiveness for high-precision oriented detection. The findings highlight the value of angle–aspect-ratio coupling and hyperparameter-free smoothing in improving DETR-based oriented detectors for aerial imagery applications.

Abstract

Existing oriented object detection methods commonly use the AP$_{50}$ metric to measure model performance. We argue that AP$_{50}$ is inherently unsuitable for oriented object detection because of its large tolerance to angle deviation. Therefore, we advocate using a high-precision metric, e.g. AP$_{75}$, to measure model performance. In this paper, we propose an Aspect Ratio Sensitive Oriented Object Detector with Transformer, termed ARS-DETR, which exhibits competitive performance in high-precision oriented object detection. Specifically, a new angle classification method, called Aspect Ratio aware Circle Smooth Label (AR-CSL), is proposed to smooth the angle label in a more reasonable way and to discard the hyperparameter introduced by previous work (e.g. CSL). Then, a rotated deformable attention module is designed to rotate the sampling points by the corresponding angles and eliminate the misalignment between region features and sampling points. Moreover, a dynamic weight coefficient based on the aspect ratio is adopted to calculate the angle loss. Comprehensive experiments on several challenging datasets show that our method achieves competitive performance on the high-precision oriented object detection task.
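The tolerance argument in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it computes SkewIoU between a box and a copy of itself rotated by an angle deviation about the shared centre, using generic Sutherland-Hodgman polygon clipping. The function names (`rect_corners`, `clip`, `skew_iou`) and the choice of equal-sized, concentric boxes are illustrative assumptions.

```python
import math

def rect_corners(w, h, theta):
    """Counter-clockwise corners of a w x h box centred at the origin, rotated by theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    base = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(x * c - y * s, x * s + y * c) for x, y in base]

def polygon_area(pts):
    """Shoelace formula; returns 0 for an empty polygon."""
    return 0.5 * abs(sum(x1 * y2 - x2 * y1
                         for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1])))

def clip(subject, clipper):
    """Sutherland-Hodgman: clip `subject` against a convex, counter-clockwise `clipper`."""
    output = subject
    for (ax, ay), (bx, by) in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break
        current, output = output, []

        def side(p):
            # >= 0 means p lies on the inner (left) side of edge a -> b
            return (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)

        for p, q in zip(current, current[1:] + current[:1]):
            sp, sq = side(p), side(q)
            if sp >= 0:
                output.append(p)
            if sp * sq < 0:  # edge p -> q crosses the clip line
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return output

def skew_iou(w, h, delta_theta):
    """IoU of a w x h box and the same box rotated by delta_theta about its centre."""
    inter = polygon_area(clip(rect_corners(w, h, 0.0), rect_corners(w, h, delta_theta)))
    return inter / (2 * w * h - inter)

if __name__ == "__main__":
    # Same 10 degree angle error, increasing aspect ratio k = w / h
    for k in (1, 2, 4, 8):
        print(k, round(skew_iou(k, 1.0, math.radians(10)), 3))
```

Under these assumptions, the same angle error costs an elongated box far more IoU than a near-square one, which is why a 0.5 IoU threshold can accept visibly misoriented predictions for low-aspect-ratio objects.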

Paper Structure (23 sections, 9 equations, 13 figures, 13 tables)

Figures (13)

  • Figure 1: Even though the angle prediction is inaccurate, it still obtains a high performance in terms of AP$_{50}$.
  • Figure 2: The curves represent the relationship between SkewIoU and angle deviation $\Delta \theta$ under different aspect ratios. $k$ indicates the aspect ratio.
  • Figure 3: Two situations for SkewIoU calculation. (a) The situation where $\Delta \theta < \theta^{*}$; (b) The boundary condition between situation 1 and situation 2; (c) The situation where $\Delta \theta > \theta^{*}$.
  • Figure 4: The framework of the proposed ARS-DETR. ‘GT’ means ground truth. ‘Train Only’ means it only works during the training process and will be removed during the inference.
  • Figure 5: Comparison of the two encoding methods for objects with different aspect ratios at each angle deviation. For ease of comparison, the labels in (a) and (c)-(f) are shown unfolded; they are actually circular, as in (b). (a) CSL smooths the angle label with a Gaussian window of fixed radius, regardless of the object's aspect ratio. (c)-(d) AR-CSL takes the object's aspect ratio into account and uses a more reasonable smoothing strategy that reflects the correlation among adjacent angles. (e) CSL ignores the angle discrete granularity $\omega$ and gives the same smoothing values for different $\omega$. (f) AR-CSL computes the smoothing values dynamically from the angle deviation, so they vary with $\omega$.
  • ...and 8 more figures
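
For readers unfamiliar with the CSL baseline described in the Figure 5 caption, here is a minimal sketch of a circular smooth label built with a fixed-radius Gaussian window. It is an illustration under stated assumptions, not the implementation from either paper: the function name `csl_label`, the 180-bin layout, the default radius of 6, the use of the radius as the Gaussian standard deviation, and the hard truncation at the radius are all hypothetical choices.

```python
import math

def csl_label(gt_bin, num_bins=180, radius=6):
    """Soft angle label: Gaussian window of fixed radius, wrapped circularly.

    gt_bin   -- index of the ground-truth angle bin
    num_bins -- number of angle classes (assumed 180 here)
    radius   -- window radius; the fixed hyperparameter that AR-CSL aims to remove
    """
    label = []
    for i in range(num_bins):
        d = abs(i - gt_bin)
        d = min(d, num_bins - d)  # circular distance between bins
        if d <= radius:
            # assumption: sigma is set equal to the window radius
            label.append(math.exp(-d * d / (2.0 * radius * radius)))
        else:
            label.append(0.0)
    return label

if __name__ == "__main__":
    lab = csl_label(0)
    # Bins on both sides of the wrap-around point receive the same weight
    print(lab[179] == lab[1])
```

Note that `csl_label` takes no aspect-ratio argument, so every object with the same ground-truth angle receives the same soft label. That is the limitation the caption attributes to CSL in panels (a) and (e).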