R2Det: Exploring Relaxed Rotation Equivariance in 2D object detection

Zhiqiang Wu, Yingjie Liu, Hanlin Dong, Xuan Tang, Jian Yang, Bo Jin, Mingsong Chen, Xian Wei

TL;DR

This work tackles Rotational Symmetry-Breaking in 2D object detection by introducing a Relaxed Rotation-Equivariant (RRE) framework. It defines a Relaxed Rotation-Equivariant GConv (R2GConv) built on a learnable perturbation $\boldsymbol{\triangle}$ that relaxes the rotation group from $\mathbf{C}_n$ to $\mathbf{R}_n$, so that relaxed equivariance can be learned end to end. The authors design a lightweight backbone (R2Net) and a redesigned detector (R2Det) that use R2GConv with an efficient mix of lifting, point-wise, and depth-wise operations, achieving faster convergence and higher AP on VOC and COCO with fewer parameters and lower FLOPs. Ablations, visualizations, and plug-and-play experiments (e.g., with YOLOv8) support the effectiveness and generality of R2GConv and RRE modeling in real-world symmetry-breaking scenarios. In practice, the approach enables robust, efficient rotation-aware detection under imperfect symmetry while remaining compatible with existing architectures and tasks.

Abstract

Group Equivariant Convolution (GConv) empowers models to explore underlying symmetry in data, improving performance. However, real-world scenarios often deviate from ideal symmetric systems because of physical permutation, characterized by non-trivial actions of a symmetry group. The resulting asymmetries affect the outputs, a phenomenon known as Symmetry Breaking. Traditional GConv-based methods are constrained by rigid operational rules within group space and assume that data remains strictly symmetric after limited group transformations. This limitation makes it difficult to adapt to Symmetry-Breaking and non-rigid transformations. Motivated by this, we mainly focus on a common scenario: Rotational Symmetry-Breaking. By relaxing strict group transformations within the Strict Rotation-Equivariant group $\mathbf{C}_n$, we redefine a Relaxed Rotation-Equivariant group $\mathbf{R}_n$ and introduce a novel Relaxed Rotation-Equivariant GConv (R2GConv) with only a minimal increase of $4n$ parameters compared to GConv. Based on R2GConv, we propose a Relaxed Rotation-Equivariant Network (R2Net) as the backbone and develop a Relaxed Rotation-Equivariant Object Detector (R2Det) for 2D object detection. Experimental results demonstrate the effectiveness of the proposed R2GConv in natural image classification, and R2Det achieves excellent performance in 2D object detection with improved generalization capabilities and robustness. The code is available at https://github.com/wuer5/r2det.
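To make the step from $\mathbf{C}_n$ to $\mathbf{R}_n$ concrete, the sketch below builds the $n$ rotation matrices of $\mathbf{C}_n$ and adds a perturbation to each one. It rests on an assumption not stated on this page: that the perturbation is one additive $2 \times 2$ matrix per group element, which would account for the $4n$ extra parameters. The helper names (`cn_rotations`, `relaxed_rotations`, `closure_error`) are hypothetical, and in R2GConv the perturbation is learned rather than sampled at random; how it is applied to the filters is not described here.

```python
import numpy as np

def cn_rotations(n):
    """Rotation matrices of the cyclic group C_n, shape (n, 2, 2)."""
    angles = 2.0 * np.pi * np.arange(n) / n
    cos, sin = np.cos(angles), np.sin(angles)
    return np.stack([np.stack([cos, -sin], axis=-1),
                     np.stack([sin, cos], axis=-1)], axis=-2)

def relaxed_rotations(n, delta):
    """Assumed relaxation: add one 2x2 perturbation per element (4n values)."""
    delta = np.asarray(delta, dtype=float)
    if delta.shape != (n, 2, 2):
        raise ValueError("delta must have shape (n, 2, 2)")
    return cn_rotations(n) + delta

def closure_error(mats):
    """max_k ||M_1 M_k - M_{(k+1) mod n}||_F; about zero for exact C_n (n >= 2)."""
    n = len(mats)
    return max(float(np.linalg.norm(mats[1] @ mats[k] - mats[(k + 1) % n]))
               for k in range(n))

n = 4
strict = relaxed_rotations(n, np.zeros((n, 2, 2)))   # zero perturbation: plain C_4
rng = np.random.default_rng(0)
delta = 0.05 * rng.standard_normal((n, 2, 2))        # stand-in for a learned perturbation
relaxed = relaxed_rotations(n, delta)

print("extra parameters:", delta.size)               # 4 * n
print("closure error, strict :", closure_error(strict))
print("closure error, relaxed:", closure_error(relaxed))
```

With a zero perturbation the set is exactly $\mathbf{C}_4$ and composes like a group up to floating-point error. With a non-zero perturbation the closure error becomes clearly non-zero, which is one way to see that the relaxed set no longer obeys the strict group rules.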

Paper Structure

This paper contains 27 sections, 2 theorems, 17 equations, 10 figures, 16 tables, 1 algorithm.

Key Result

Proposition 7.2

Let $\phi_{gt}$ be relaxed (or $\epsilon$-approximate) equivariant and Lipschitz with constant $k$. Then, we have

Figures (10)

  • Figure 1: Left: The ideal feature of a circle with higher symmetry rarely occurs in real-world scenarios. Instead, physical perturbation of the higher symmetry results in lower symmetry. While ENNs can handle higher symmetry, Curie’s principle dictates that features of higher symmetry cannot be mapped to outputs with lower symmetry, which induces Symmetry-Breaking situations and impairs feature learning. An RRE function $\phi_{\texttt{RRE}}$ can handle Rotational Symmetry-Breaking situations, as proven in [kaba2023symmetry]. Note that $\texttt{Sym}(\cdot)$ denotes the level of symmetry of its argument. Right: This work proposes the R2EFilter to build RRE by incorporating a learnable perturbation $\Delta$. We show the forward and backward processes of the R2EFilter based on a relaxed $\mathbf{C}_4$, named $\mathbf{R}_4$.
  • Figure 2: The architecture of R2Net-N as the backbone for feature extraction, where #C denotes the number of Bottlenecks in R2Net Blocks, which depends on the channel sizes and varies with the size of R2Net (i.e., R2Det-N / S / M). Note that every R2GConv, including its three variants, is followed by normalization (BatchNorm) and activation (SiLU) functions, which are omitted from the figure.
  • Figure 3: The architecture of R2Det for 2D object detection with an FPN+PAN neck. Here, we only show a simplified architectural diagram. For the detailed architecture, please refer to the appendix on R2Det.
  • Figure 4: $\textbf{AP}_{50}$ curves on the VOC test set. All models are trained for $200$ epochs with the same settings.
  • Figure 5: Visualization of the rotated feature maps in RRE (Ours), SRE, and NRE based on $\mathbf{C}_4$.
  • ...and 5 more figures

Theorems & Definitions (3)

  • Definition 7.1: Equivariance Error
  • Proposition 7.2
  • Proposition 7.3
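The statement of Definition 7.1 is not reproduced on this page, so the sketch below uses one common notion of equivariance error: the norm of $\phi(g \cdot x) - g \cdot \phi(x)$. Under that reading, a map is $\epsilon$-approximately equivariant when this quantity stays below $\epsilon$. The example is restricted to 90-degree rotations (via `np.rot90`), and the helper names and kernels are hypothetical illustrations, not the paper's definitions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid(x, kernel):
    """2D 'valid' cross-correlation of a square image x with a kernel."""
    windows = sliding_window_view(x, kernel.shape)   # (H-k+1, W-k+1, k, k)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def equivariance_error(phi, x, turns=1):
    """||phi(rot(x)) - rot(phi(x))|| for a rotation by turns * 90 degrees."""
    lhs = phi(np.rot90(x, turns))    # rotate the input, then apply phi
    rhs = np.rot90(phi(x), turns)    # apply phi, then rotate the output
    return float(np.linalg.norm(lhs - rhs))

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 16))

# This kernel is unchanged by a 90-degree rotation, so correlation with it
# commutes with np.rot90.
symmetric = np.array([[0.0, 1.0, 0.0],
                      [1.0, 4.0, 1.0],
                      [0.0, 1.0, 0.0]])
# A small asymmetric term breaks that invariance.
skewed = symmetric.copy()
skewed[1, 2] += 0.5

err_sym = equivariance_error(lambda im: correlate_valid(im, symmetric), x)
err_skew = equivariance_error(lambda im: correlate_valid(im, skewed), x)
print("symmetric kernel error:", err_sym)
print("skewed kernel error   :", err_skew)
```

The symmetric kernel gives an error at the level of floating-point round-off, while the skewed kernel gives a clearly non-zero error, which is the kind of deviation a relaxed-equivariance bound is meant to control.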