GRA: Detecting Oriented Objects through Group-wise Rotating and Attention

Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang

TL;DR

GRA is a lightweight yet effective Group-wise Rotating and Attention module that replaces the convolution operations in backbone networks for oriented object detection. It achieves a new state-of-the-art (SOTA) on the DOTA-v2.0 benchmark while using nearly 50% fewer parameters than the previous SOTA method.

Abstract

Oriented object detection, an emerging task in recent years, aims to identify and locate objects across varied orientations. This requires the detector to accurately capture orientation information, which varies significantly within and across images. Despite substantial existing efforts, simultaneously ensuring model effectiveness and parameter efficiency remains challenging in this scenario. In this paper, we propose a lightweight yet effective Group-wise Rotating and Attention (GRA) module to replace the convolution operations in backbone networks for oriented object detection. GRA adaptively captures fine-grained features of objects with diverse orientations and comprises two key components: Group-wise Rotating and Group-wise Attention. Group-wise Rotating first divides the convolution kernel into groups, where each group extracts different object features by rotating at a specific angle according to the object orientation. Subsequently, Group-wise Attention is employed to adaptively enhance the object-related regions in the feature. Together, these components enable GRA to capture diverse orientation information while maintaining parameter efficiency. Extensive experimental results demonstrate the superiority of our method. For example, GRA achieves a new state-of-the-art (SOTA) on the DOTA-v2.0 benchmark while using nearly 50% fewer parameters than the previous SOTA method. Code will be released.
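
The Group-wise Rotating step described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `rotate_kernel` and `group_wise_rotate` are hypothetical, groups are assumed to be equal slices of the output-channel dimension, rotation is assumed to use bilinear interpolation with zero padding, and the per-group angles are supplied by the caller rather than predicted from the input.

```python
import math


def rotate_kernel(kernel, theta):
    """Rotate a square k x k kernel (list of lists) by theta radians about its
    centre. Assumption: bilinear interpolation, zeros outside the kernel."""
    k = len(kernel)
    c = (k - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            # Inverse-map the output position (i, j) into the source kernel.
            y, x = i - c, j - c
            src_x = cos_t * x + sin_t * y + c
            src_y = -sin_t * x + cos_t * y + c
            x0, y0 = math.floor(src_x), math.floor(src_y)
            fx, fy = src_x - x0, src_y - y0
            # Blend the four neighbouring source weights.
            for yy, wy in ((y0, 1.0 - fy), (y0 + 1, fy)):
                for xx, wx in ((x0, 1.0 - fx), (x0 + 1, fx)):
                    if 0 <= yy < k and 0 <= xx < k:
                        out[i][j] += wy * wx * kernel[yy][xx]
    return out


def group_wise_rotate(weight, angles):
    """weight: nested list shaped [C_out][C_in][k][k].
    angles: one rotation angle (radians) per group.
    Assumption: output channels are split into len(angles) equal groups,
    and every kernel in a group shares that group's angle."""
    c_out = len(weight)
    n_groups = len(angles)
    assert c_out % n_groups == 0, "C_out must be divisible by the group count"
    per_group = c_out // n_groups
    return [
        [rotate_kernel(ker, angles[o // per_group]) for ker in weight[o]]
        for o in range(c_out)
    ]
```

Because every kernel in a group shares one angle, only one angle per group has to be supplied, rather than one per kernel. How the angles are obtained from the input is not covered by this sketch.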

Paper Structure (13 sections, 6 equations, 5 figures, 7 tables)

Figures (5)

  • Figure 1: Comparison between the recent SOTA method ARC (Pu et al., 2023) and our proposed method GRA. Our method not only reduces the noise in the features but is also much more lightweight.
  • Figure 2: Distribution of the confidence predicted by ARC ($m=1$). The angle predicted by the ARC module affects the final prediction for objects at different angles in the image. In general, objects whose orientation is close to the ARC-predicted angle are detected with higher confidence than objects whose orientation diverges from it.
  • Figure 3: Comparison of the predictions of ARC and our method. The weighted sum used by ARC can lead to inaccurate features, causing a number of low-confidence detections. In contrast, our method detects more objects in the images and with higher confidence.
  • Figure 4: An overview of our proposed GRA method, which contains two components: Group-wise Rotating and Group-wise Attention. In the Group-wise Rotating module, the input kernel $\boldsymbol{W} \in \mathbb{R}^{C_{\text{out}}\times C_{\text{in}}\times k\times k}$ is rotated in a group-wise manner to obtain $\boldsymbol{\widetilde{W}}$, which is then convolved with the input feature map $\boldsymbol{x}\in \mathbb{R}^{C_{\text{in}}\times H_{\text{in}} \times W_{\text{in}}}$. The output $\boldsymbol{y}\in \mathbb{R}^{C_{\text{out}}\times H_{\text{out}} \times W_{\text{out}}}$ is then fed into the Group-wise Attention module for denoising and refinement, yielding the final output $\boldsymbol{\widetilde{y}}$.
  • Figure 5: Visualisation of the predictions of ARC (Pu et al., 2023) and our model.
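
The last stage in the Figure 4 caption, where the convolution output $\boldsymbol{y}$ is refined into $\boldsymbol{\widetilde{y}}$ by Group-wise Attention, can be illustrated with a short plain-Python sketch. The caption does not specify the attention design, so everything below is an assumption made for illustration: the function name `group_wise_attention` is hypothetical, channels are split into equal groups, and each group is reweighted by one spatial mask computed as the sigmoid of the group's channel-wise mean.

```python
import math


def group_wise_attention(y, num_groups):
    """y: nested list shaped [C_out][H_out][W_out]; returns y_tilde, same shape.
    Assumption (not from the paper): each group of channels shares one spatial
    mask, sigmoid(mean over the group's channels), multiplied element-wise."""
    c_out, h, w = len(y), len(y[0]), len(y[0][0])
    assert c_out % num_groups == 0, "C_out must be divisible by num_groups"
    per_group = c_out // num_groups
    y_tilde = []
    for g in range(num_groups):
        chans = y[g * per_group:(g + 1) * per_group]
        # One mask per group: values near 1 keep a location, near 0 suppress it.
        mask = [
            [
                1.0 / (1.0 + math.exp(-sum(ch[i][j] for ch in chans) / per_group))
                for j in range(w)
            ]
            for i in range(h)
        ]
        for ch in chans:
            y_tilde.append([[ch[i][j] * mask[i][j] for j in range(w)] for i in range(h)])
    return y_tilde
```

In this sketch, locations where a group responds strongly are kept almost unchanged, while weak or negative responses are scaled towards zero, which is one simple way to read "denoising and refining". A learned attention branch would replace the fixed mean-and-sigmoid rule used here.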