Relaxed Rotational Equivariance via $G$-Biases in Vision

Zhiqiang Wu, Yingjie Liu, Licheng Sun, Jian Yang, Hanlin Dong, Shing-Ho J. Lin, Xuan Tang, Jinpeng Mi, Bo Jin, Xian Wei

TL;DR

This work addresses the gap between ideal, strict rotational equivariance and the imperfect symmetry observed in real-world visual data. It introduces $G$-Biases to relax group-based filters, forming the Relaxed Rotational Equivariant Convolution (RREConv) and enabling end-to-end learning of symmetry-breaking patterns. The proposed Relaxed Rotational Equivariant Filter (RREF), together with RREConv, yields Relaxed Rotational Equivariance Networks (RRENet) and Detectors (RREDet) that outperform strict-equivariant and non-equivariant baselines in both image classification and 2D object detection, with modest increases in parameters. The approach demonstrates that relaxing symmetry constraints via learnable biases can capture dataset-specific rotational variations, offering a practical, plug-and-play improvement for vision systems operating under Rotational Symmetry-Breaking (RSB) conditions.

Abstract

Group Equivariant Convolution (GConv) can capture rotational equivariance from original data. It assumes that all features are uniformly and strictly rotation-equivariant under the transformations of a specific group. However, the presentation or distribution of real-world data rarely conforms to strict rotational equivariance, a phenomenon commonly referred to as Rotational Symmetry-Breaking (RSB) in the system or dataset, and GConv cannot adapt effectively to it. Motivated by this, we propose a simple but highly effective method to address this problem: a set of learnable biases, called $G$-Biases, defined under the group order, breaks the strict group constraints and yields a Relaxed Rotational Equivariant Convolution (RREConv). To validate the efficiency of RREConv, we conduct extensive ablation experiments on the discrete rotational group $\mathcal{C}_n$. Experiments demonstrate that the proposed RREConv-based methods achieve excellent performance compared to existing GConv-based methods in both classification and 2D object detection tasks on natural image datasets.
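To make the idea of breaking strict group constraints concrete, here is a minimal NumPy sketch on $\mathcal{C}_4$. It assumes, purely for illustration, that the $G$-Biases are additive offsets with one bias tensor per group element; the abstract does not specify the exact construction, and the function names `c4_filter_bank` and `relaxed_c4_filter_bank` are hypothetical.

```python
import numpy as np

def c4_filter_bank(base):
    """Strict C4 bank: one filter rotated by 0, 90, 180 and 270 degrees."""
    return np.stack([np.rot90(base, k) for k in range(4)])

def relaxed_c4_filter_bank(base, g_biases):
    """Relaxed bank: every rotated copy receives its own offset.

    ASSUMPTION: the G-Biases act additively, one bias tensor per group
    element. This is an illustrative guess, not the paper's definition.
    """
    return c4_filter_bank(base) + g_biases

base = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
strict = c4_filter_bank(base)

# Small random values stand in for biases that would be learned end to end.
rng = np.random.default_rng(0)
g_biases = 0.05 * rng.standard_normal(strict.shape)
relaxed = relaxed_c4_filter_bank(base, g_biases)

# In the strict bank, copy k is exactly copy 0 rotated k times.
strict_ok = all(np.array_equal(np.rot90(strict[0], k), strict[k])
                for k in range(4))
# In the relaxed bank, that exact relation no longer holds.
relaxed_ok = all(np.allclose(np.rot90(relaxed[0], k), relaxed[k])
                 for k in range(4))
print(strict_ok, relaxed_ok)
```

With all biases set to zero the relaxed bank reduces to the strict one, so under this assumption strict equivariance remains a special case that training could recover.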

Paper Structure (36 sections, 16 equations, 6 figures, 4 tables)

This paper contains 36 sections, 16 equations, 6 figures, 4 tables.

Figures (6)

  • Figure 1: (a) A car turning right at an angle of exactly 90 degrees denotes the strict adherence to the motion rules on the group $\mathcal{C}_4$. (b) Another car turning right at an angle of approximately 90 degrees represents a deviation from strict rotational symmetry on the group $\mathcal{C}_4$, leading to Rotational Symmetry-Breaking (RSB) within a car's motion. Note that the figure emphasizes the symmetry of an object's potential motion, not the symmetry of an object itself on the group $\mathcal{C}_4$.
  • Figure 2: The $2 \times 2$ filters between Strict Rotational Equivariance (SRE) and Relaxed Rotational Equivariance (RRE) on the group $\mathcal{C}_4$. Filters 1 to 4 in Figure (a) have the same values in four directions, whereas Filters 1 to 4 in Figure (b) have slightly different values in four directions.
  • Figure 3: The construction of Relaxed Rotational Equivariant Filter (RREF). Note that the initial weights have $G_l$ columns, but only the $G$-transformation in the last column (Red Box) is shown here for the convenience of drawing. The operations in other columns (Gray Box) are the same as the last column (Red Box).
  • Figure 4: The architecture of the backbone RRENet-n based on RREConv. Note that $n$ denotes the order of $\mathcal{C}_n$, and "-CBA" means "Conv + BatchNorm + Activate" operations.
  • Figure 5: The architecture of RREDet. Note that #{2,3,4} denote multi-scale feature maps from layers {2,3,4} of the backbone RRENet. The PD RREUp adopts the same structure as the PD RREConv, except that it uses transposed convolution.
  • ...and 1 more figure
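The contrast described for Figure 2 can be probed numerically. The sketch below assumes a single-channel lifting correlation on $\mathcal{C}_4$ and models the relaxed filters as strict filters plus small random perturbations standing in for learned $G$-Biases. The helper names are hypothetical and this is not the paper's implementation. It measures how far the response to a rotated input deviates from the rotated, channel-shifted response that strict equivariance predicts.

```python
import numpy as np

def corr2d_valid(x, w):
    """Plain 'valid' 2D cross-correlation of image x with filter w."""
    h, v = w.shape
    out = np.zeros((x.shape[0] - h + 1, x.shape[1] - v + 1))
    for a in range(h):
        for b in range(v):
            out += w[a, b] * x[a:a + out.shape[0], b:b + out.shape[1]]
    return out

def lift(x, bank):
    """Apply each of the four filters; the result has one channel per rotation."""
    return np.stack([corr2d_valid(x, f) for f in bank])

def equivariance_error(x, bank):
    """Max deviation from the strict C4 equivariance relation.

    For a strict bank, rotating the input by 90 degrees rotates every
    output map by 90 degrees and shifts the rotation channels by one.
    """
    y = lift(x, bank)
    y_rot = lift(np.rot90(x), bank)
    expected = np.stack([np.rot90(c) for c in np.roll(y, 1, axis=0)])
    return float(np.max(np.abs(y_rot - expected)))

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
base = rng.standard_normal((2, 2))

strict = np.stack([np.rot90(base, k) for k in range(4)])
# ASSUMPTION: additive perturbations as a stand-in for learned G-Biases.
relaxed = strict + 0.1 * rng.standard_normal(strict.shape)

strict_err = equivariance_error(x, strict)
relaxed_err = equivariance_error(x, relaxed)
print(strict_err, relaxed_err)
```

The strict bank satisfies the relation up to floating-point rounding, while the perturbed bank shows a clearly nonzero error. That deviation is the kind a relaxed model would be free to fit to the data.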