Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation

Zeyang Zhao, Qilong Xue, Yuhang He, Yifan Bai, Xing Wei, Yihong Gong

TL;DR

This paper tackles oriented object detection, targeting two problems in existing representations: loss discontinuities caused by angle-based rotated boxes, and ambiguities in edge cases. It introduces a point-axis representation that decouples location from orientation: each object is described by a point set $\mathcal{P}_i$ of size $K$ and an axis encoding $\mathcal{A}_i$, discretized into $N_{bins}$ angular bins (default $360$) with a four-peak circular encoding. The point set is supervised by a max-projection loss and the axis encoding by a cross-axis loss. Oriented DETR, built on DETR, uses conditioned point queries and a dedicated points detection decoder to predict $\hat{\mathcal{P}}_i$, $\hat{c}_i$, and $\hat{\mathcal{A}}_i$ end-to-end. Experiments on DOTA, DIOR-R, HRSC2016 and COCO report improvements over prior state-of-the-art methods and indicate that the approach generalizes beyond aerial datasets.
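To make the four-peak circular encoding concrete, here is a minimal sketch of how an axis direction could be turned into a label over $N_{bins}$ angular bins. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian smoothing, the `sigma` value, and the names `four_peak_encoding` and `decode_axis` are hypothetical. The only properties taken from the summary are the discretization into bins (default 360) and the four peaks on a circle, placed here 90 degrees apart so that the label is unchanged when the axes are rotated by a quarter turn.

```python
import math


def four_peak_encoding(theta_deg, n_bins=360, sigma=4.0):
    """Encode an axis direction as a circular label with four peaks.

    Assumption: each peak is a Gaussian bump of width `sigma` degrees;
    the summary only states that the encoding is circular with four peaks.
    """
    bin_width = 360.0 / n_bins
    # The two perpendicular axes give four directions, 90 degrees apart.
    peaks = [(theta_deg + k * 90.0) % 360.0 for k in range(4)]
    label = []
    for b in range(n_bins):
        angle = b * bin_width
        best = 0.0
        for p in peaks:
            d = abs(angle - p)
            d = min(d, 360.0 - d)  # circular distance in degrees
            best = max(best, math.exp(-d * d / (2.0 * sigma * sigma)))
        label.append(best)
    return label


def decode_axis(label):
    """Recover the axis angle in [0, 90) from the strongest bin."""
    n_bins = len(label)
    idx = max(range(n_bins), key=lambda b: label[b])
    return (idx * 360.0 / n_bins) % 90.0


encoding = four_peak_encoding(30.0)
recovered = decode_axis(encoding)
```

Because the label is periodic with period 90 degrees, the angles 30 and 120 produce the same target in this sketch, which is one way such an encoding avoids a discontinuity at the angular boundary.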

Abstract

This paper introduces the point-axis representation for oriented object detection, emphasizing its flexibility and geometrically intuitive nature with two key components: points and axes. 1) Points delineate the spatial extent and contours of objects, providing detailed shape descriptions. 2) Axes define the primary directionalities of objects, providing essential orientation cues crucial for precise detection. The point-axis representation decouples location and rotation, addressing the loss discontinuity issues commonly encountered in traditional bounding box-based approaches. For effective optimization without introducing additional annotations, we propose the max-projection loss to supervise point set learning and the cross-axis loss for robust axis representation learning. Further, leveraging this representation, we present the Oriented DETR model, seamlessly integrating the DETR framework for precise point-axis prediction and end-to-end detection. Experimental results demonstrate significant performance improvements in oriented object detection tasks.
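The idea of supervising a point set by projecting it onto axes can be sketched as follows. This is one plausible reading of a max-projection loss and not the authors' formulation: the L1 penalty, the use of four directions around a given center, and the names `max_projections`, `max_projection_l1`, and `rotated_box_corners` are all assumptions made for illustration. The sketch projects each point onto the four axis directions and compares the largest projection in each direction with the extent of a target box, so no per-point annotation is needed.

```python
import math


def max_projections(points, center, theta_deg):
    """Largest projection of the points onto four directions 90 degrees apart."""
    t = math.radians(theta_deg)
    cx, cy = center
    extents = []
    for k in range(4):
        dx = math.cos(t + k * math.pi / 2.0)
        dy = math.sin(t + k * math.pi / 2.0)
        extents.append(max((x - cx) * dx + (y - cy) * dy for x, y in points))
    return extents


def max_projection_l1(points, center, theta_deg, target_extents):
    """Mean absolute gap between projected and target extents (assumed L1 form)."""
    pred = max_projections(points, center, theta_deg)
    return sum(abs(p - g) for p, g in zip(pred, target_extents)) / 4.0


def rotated_box_corners(center, width, height, theta_deg):
    """Corners of a rotated box, used here only to build a toy point set."""
    t = math.radians(theta_deg)
    cx, cy = center
    corners = []
    for sx, sy in [(1, 1), (-1, 1), (-1, -1), (1, -1)]:
        lx, ly = sx * width / 2.0, sy * height / 2.0
        corners.append((cx + lx * math.cos(t) - ly * math.sin(t),
                        cy + lx * math.sin(t) + ly * math.cos(t)))
    return corners


# Toy example: a 4 x 2 box rotated by 30 degrees around (1, 1).
points = rotated_box_corners((1.0, 1.0), 4.0, 2.0, 30.0)
loss = max_projection_l1(points, (1.0, 1.0), 30.0, [2.0, 1.0, 2.0, 1.0])
```

In this toy case the points are the box corners themselves, so the projected extents coincide with the target half-width and half-height up to floating-point error and the loss is essentially zero. Points lying strictly inside the box would leave a positive gap in at least one direction.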

Paper Structure (14 sections, 6 equations, 7 figures, 8 tables)

This paper contains 14 sections, 6 equations, 7 figures, 8 tables.

Figures (7)

  • Figure 1: Robust oriented object detection results with point-axis representation.
  • Figure 2: Mainstream oriented object representations. (a) is the rotated bounding box representation, (b) is the point set representation, and (c) is the point-axis representation we propose.
  • Figure 3: The overall framework of the point-axis representation. (a) provides a visual depiction of the representation method. (b) outlines the overall process of loss constraints. (c) introduces the key loss, the max-projection loss, which we propose for supervising point set learning.
  • Figure 4: The architecture of Oriented DETR.
  • Figure 5: The distributions of predicted axes for objects without clear orientation definition.
  • ...and 2 more figures