PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Botao Ren, Xue Yang, Yi Yu, Junwei Luo, Zhidong Deng

TL;DR

This paper proposes PointOBB-v2, a simpler, faster, and stronger method that generates pseudo rotated boxes from points without relying on any other prior. It also resolves the confusion caused by overlap on the CPM, enabling operation in high-density scenarios.

Abstract

Single point supervised oriented object detection has gained attention and made initial progress within the community. Unlike approaches that rely on one-shot samples or powerful pretrained models (e.g. SAM), PointOBB has shown promise because it is prior-free. In this paper, we propose PointOBB-v2, a simpler, faster, and stronger method to generate pseudo rotated boxes from points without relying on any other prior. Specifically, we first generate a Class Probability Map (CPM) by training the network with non-uniform positive and negative sampling. We show that the CPM learns the approximate object regions and their contours. Then, Principal Component Analysis (PCA) is applied to accurately estimate the orientation and the boundary of objects. By further incorporating a separation mechanism, we resolve the confusion caused by overlap on the CPM, enabling operation in high-density scenarios. Extensive comparisons demonstrate that our method trains 15.58x faster and improves accuracy by 11.60%/25.15%/21.19% on the DOTA-v1.0/v1.5/v2.0 datasets compared to the previous state-of-the-art, PointOBB. This significantly advances the state of single point supervised oriented detection in the modular track.
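
To make the PCA step concrete, the following is a minimal sketch of how an orientation could be read off a probability map. It is not the authors' implementation: the function name `pca_orientation`, the fixed `threshold`, and the choice to weight pixels by their probability are all assumptions made for illustration, and the synthetic blob stands in for a real CPM.

```python
import numpy as np

def pca_orientation(prob_map, threshold=0.5):
    """Estimate centre, major-axis angle, and axis variances of a blob.

    Hypothetical helper: thresholding and probability weighting are
    illustrative assumptions, not details taken from the paper.
    Returns (centre_xy, angle in [0, pi), variances in descending order).
    """
    ys, xs = np.nonzero(prob_map > threshold)
    if xs.size == 0:
        raise ValueError("no pixels above threshold")
    weights = prob_map[ys, xs]
    coords = np.stack([xs, ys], axis=1).astype(float)

    # Probability-weighted mean and covariance of the pixel coordinates.
    centre = np.average(coords, axis=0, weights=weights)
    centered = coords - centre
    cov = (centered * weights[:, None]).T @ centered / weights.sum()

    # eigh returns eigenvalues in ascending order, so the last
    # eigenvector is the direction of largest spread (the major axis).
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, -1]
    angle = np.arctan2(major[1], major[0]) % np.pi
    return centre, angle, eigvals[::-1]

# Synthetic stand-in for a CPM: an elongated Gaussian blob rotated by 30 degrees.
true_angle = np.pi / 6
yy, xx = np.mgrid[0:64, 0:64]
dx, dy = xx - 32.0, yy - 32.0
u = dx * np.cos(true_angle) + dy * np.sin(true_angle)
v = -dx * np.sin(true_angle) + dy * np.cos(true_angle)
cpm = np.exp(-0.5 * ((u / 10.0) ** 2 + (v / 3.0) ** 2))

centre, angle, variances = pca_orientation(cpm)
```

The angle is measured in image coordinates (x to the right, y downward) and is only defined modulo pi, since a principal axis has no preferred direction. The two variances give a rough sense of the blob's extent along and across that axis, which is one way a box boundary might then be derived.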

Paper Structure

This paper contains 17 sections, 8 equations, 4 figures, 9 tables.

Figures (4)

  • Figure 1: Comparison with existing point supervised methods, including (a) Prompt OOD (i.e. SAM-based); (b-c) Weakly OOD: Prior-based and Modular. OOD stands for Oriented Object Detection.
  • Figure 2: Our PointOBB-v2 first generates a Class Probability Map (CPM) with a positive and negative sample assignment strategy during training. It then applies Principal Component Analysis (PCA) to infer object orientation and boundaries to generate pseudo labels.
  • Figure 3: Visualization of class probability map (CPM).
  • Figure 4: Visualization of pseudo-labels and detection results from our model compared to PointOBB. For clarity, label texts for dense objects are hidden.