Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection

Zhen Zhou, Yunkai Ma, Junfeng Fan, Zhaoyang Liu, Fengshui Jing, Min Tan

TL;DR

The paper tackles boundary discontinuity and numerical instability in oriented object detection by introducing Linear Gaussian Bounding Box (LGBB), which linearizes Gaussian bounding box parameters to stabilize regression, and Ring-Shaped Rotated Convolution (RRC), which adaptively rotates feature maps to extract rotation-sensitive information under a ring-shaped receptive field. LGBB avoids discontinuities by mapping OBBs into a linearly transformed Gaussian representation and adds a positive definite constraint to ensure covariance validity, while RRC accelerates the aggregation of rotation-aware features across orientations with modular components like AGM, PEM, and RFEM. The approach achieves state-of-the-art results on DOTA-v1.5 and HRSC2016, and is demonstrated to be plug-and-play across detectors like YOLOv7 and mmrotate-based models, improving accuracy with manageable speed cost. These contributions offer a practical and robust path to more accurate and efficient oriented object detection in aerial and maritime imagery, with potential for broader rotation-sensitive visual tasks.

Abstract

In oriented object detection, current representations of oriented bounding boxes (OBBs) often suffer from the boundary discontinuity problem. Methods that design continuous regression losses do not fundamentally solve this problem. Although the Gaussian bounding box (GBB) representation avoids this problem, directly regressing GBB is susceptible to numerical instability. We propose linear GBB (LGBB), a novel OBB representation. By linearly transforming the elements of GBB, LGBB avoids the boundary discontinuity problem and has high numerical stability. In addition, existing convolution-based rotation-sensitive feature extraction methods only have local receptive fields, resulting in slow feature aggregation. We propose ring-shaped rotated convolution (RRC), which adaptively rotates feature maps to arbitrary orientations to extract rotation-sensitive features under a ring-shaped receptive field, rapidly aggregating features and contextual information. Experimental results demonstrate that LGBB and RRC achieve state-of-the-art performance. Furthermore, integrating LGBB and RRC into various models effectively improves detection accuracy.
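
The boundary discontinuity mentioned in the abstract can be illustrated with a small sketch. It assumes the mapping commonly used in Gaussian-based OBB work, where the mean is the box center and the covariance is $R(\theta)\,\mathrm{diag}(w^2/4, h^2/4)\,R(\theta)^\top$; the function name `obb_to_gbb` and the sample boxes are hypothetical, and the paper's own notation may differ. The sketch does not implement LGBB, whose linear transform is not given in this summary.

```python
import numpy as np

def obb_to_gbb(x, y, w, h, theta):
    """Map an OBB (center, width, height, angle in radians) to (mu, Sigma).

    Assumed mapping: Sigma = R diag(w^2/4, h^2/4) R^T, with R the 2-D rotation.
    """
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    lam = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    mu = np.array([x, y], dtype=float)
    sigma = rot @ lam @ rot.T
    return mu, sigma

# Two parameterizations of the same rectangle: swapping width and height
# while adding 90 degrees to the angle leaves the box unchanged.
box_a = (50.0, 40.0, 20.0, 10.0, 0.3)
box_b = (50.0, 40.0, 10.0, 20.0, 0.3 + np.pi / 2)

mu_a, sigma_a = obb_to_gbb(*box_a)
mu_b, sigma_b = obb_to_gbb(*box_b)

# Largest difference between the raw (x, y, w, h, theta) parameters.
param_gap = np.abs(np.array(box_a) - np.array(box_b)).max()
# Largest difference between the covariance entries.
gbb_gap = np.abs(sigma_a - sigma_b).max()

print("largest gap in (x, y, w, h, theta):", param_gap)
print("largest gap in Sigma entries:", gbb_gap)

# Covariance entries grow quadratically with box size, so small and large
# objects produce regression targets on very different scales.
_, sigma_small = obb_to_gbb(0.0, 0.0, 16.0, 8.0, 0.3)
_, sigma_large = obb_to_gbb(0.0, 0.0, 1600.0, 800.0, 0.3)
scale_ratio = sigma_large[0, 0] / sigma_small[0, 0]
print("Sigma scale ratio for a 100x larger box:", scale_ratio)
```

The two parameter tuples are far apart even though they describe the same box, while their covariance matrices coincide up to floating-point error. The last lines show only that the covariance entries span a wide numeric range across object sizes; they do not reproduce the paper's stability analysis.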

Paper Structure (26 sections, 13 equations, 11 figures, 8 tables, 1 algorithm)

Figures (11)

  • Figure 1: Seven different OBB representations. Unlike representations (i)-(v), which suffer from the boundary discontinuity problem, GBB (vi) and LGBB (vii) are modeled on Gaussian distributions and do not have this problem. Compared with GBB, LGBB achieves high numerical stability by linearly transforming the elements of GBB.
  • Figure 2: Two different types of rotation-sensitive feature extraction methods. Left: methods that modify standard convolution kernels (e.g., rotating the kernels) or adjust the original feature maps (e.g., deformable convolution, DCN). Right: methods that adaptively rotate feature maps. As the feature map rotates, the receptive field of the convolution becomes ring-shaped.
  • Figure 3: Layout of LGBB and RRC in oriented object detectors. In this example, the detector regresses LGBB as its target, and RRC is applied to the first layer of each downsampling stage from $P_2$ to $P_6$ to extract rotation-sensitive features.
  • Figure 4: Mapping an OBB into a GBB.
  • Figure 5: Comparisons between regression on GBB and regression on LGBB. (a) The relationship between the gradient values of the losses with respect to GBB and LGBB (ordinate) and the object sizes (abscissa). (b) The range of values for $\Sigma$ in GBB and $L$ in LGBB.
  • ...and 6 more figures
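
For readers without access to Figure 4, the OBB-to-GBB mapping can be written out as a worked equation. This is the form commonly used in Gaussian-based OBB work and is assumed here, not copied from the paper; the element names $a$, $b$, $c$ are illustrative labels.

$$
\mu = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad
\Sigma = R(\theta)\begin{pmatrix} \tfrac{w^2}{4} & 0 \\ 0 & \tfrac{h^2}{4} \end{pmatrix} R(\theta)^\top
= \begin{pmatrix} a & c \\ c & b \end{pmatrix}, \qquad
R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}
$$

$$
a = \tfrac{w^2}{4}\cos^2\theta + \tfrac{h^2}{4}\sin^2\theta, \qquad
b = \tfrac{w^2}{4}\sin^2\theta + \tfrac{h^2}{4}\cos^2\theta, \qquad
c = \tfrac{w^2 - h^2}{4}\sin\theta\cos\theta
$$

Under this mapping, $a + b = (w^2 + h^2)/4$ and $ab - c^2 = w^2 h^2 / 16$. A regressed triple $(a, b, c)$ therefore corresponds to a valid box only if $a > 0$ and $ab - c^2 > 0$, which is the kind of positive definite condition the TL;DR refers to. Because every element scales with $w^2$ and $h^2$, the value range of $\Sigma$ widens quickly with object size, which is consistent with the comparison described in the Figure 5 caption.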