
Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured

Hanlin Mo, Guoying Zhao

TL;DR

The paper addresses the challenge of achieving rotation invariance in convolutional neural networks without relying on data augmentation. It introduces a family of rotation-invariant convolutions (RIConvs) built from non-learnable operators; they keep the same number of learnable parameters as standard convolutions and can be interchanged with them. Gradient-based RIConvs achieve state-of-the-art performance on MNIST-Rot, and integrating RIConvs into common CNN backbones yields substantial gains on texture, aircraft, and remote sensing tasks, particularly when training data are limited. The results also indicate that data augmentation, while beneficial, is complementary to mechanism-based invariance, underscoring the practical value of RIConvs for robust rotation handling in real-world applications.

Abstract

Achieving rotation invariance in deep neural networks without relying on data has long been an active research topic. Intrinsic rotation invariance can enhance a model's feature representation capability, enabling better performance in tasks such as multi-orientation object recognition and detection. Based on various types of non-learnable operators, such as gradient, sort, local binary pattern, and maximum operators, this paper designs a set of new convolution operations that are naturally invariant to arbitrary rotations. Unlike most previous studies, these rotation-invariant convolutions (RIConvs) have the same number of learnable parameters and a similar computational process as conventional convolution operations, making the two interchangeable. Using the MNIST-Rot dataset, we first verify the invariance of these RIConvs under various rotation angles and compare their performance with previous rotation-invariant convolutional neural networks (RI-CNNs). Two types of RIConvs based on gradient operators achieve state-of-the-art results. Subsequently, we combine RIConvs with classic CNN backbones of different types and depths. Using the OuTex_00012, MTARSI, and NWPU-RESISC-45 datasets, we test their performance on texture recognition, aircraft type recognition, and remote sensing image classification tasks. The results show that RIConvs significantly improve the accuracy of these CNN backbones, especially when the training data is limited. Furthermore, we find that even with data augmentation, RIConvs can further enhance model performance.
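
As a rough illustration of how a non-learnable operator can make a convolution-like response independent of orientation, the sketch below sorts the eight neighbors of a $3\times3$ patch before applying learnable weights. This is a toy example under stated assumptions, not the paper's actual RIConv definition: the function name `sorted_ring_response`, the weight layout (eight ring weights plus one center weight, nine parameters in total as in a standard $3\times3$ kernel), and the restriction to $90^{\circ}$ rotations are all hypothetical choices made for illustration. Handling arbitrary angles would need additional machinery such as interpolation, which is not shown.

```python
import numpy as np


def sorted_ring_response(patch, ring_weights, center_weight):
    """Hypothetical sort-based response for a single 3x3 patch.

    The 8 neighbors are sorted before weighting, so the result depends
    only on their values, not on where they sit around the center.
    """
    flat = patch.reshape(-1)
    ring = np.delete(flat, 4)          # drop the center pixel (index 4)
    center = flat[4]
    return float(np.sort(ring) @ ring_weights + center * center_weight)


rng = np.random.default_rng(0)
patch = rng.normal(size=(3, 3))

# 8 + 1 = 9 learnable values, the same count as a plain 3x3 kernel.
ring_weights = rng.normal(size=8)
center_weight = float(rng.normal())
plain_kernel = rng.normal(size=(3, 3))

# Rotate the patch by 0, 90, 180, and 270 degrees and compare responses.
sorted_responses = [
    sorted_ring_response(np.rot90(patch, k), ring_weights, center_weight)
    for k in range(4)
]
plain_responses = [
    float(np.sum(np.rot90(patch, k) * plain_kernel)) for k in range(4)
]

print("sort-based:", np.round(sorted_responses, 6))  # identical values
print("plain conv:", np.round(plain_responses, 6))   # values differ
```

A $90^{\circ}$ rotation only permutes the eight neighbors, so the sorted vector is unchanged and the response stays the same, while the plain kernel response changes with orientation.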

Paper Structure

This paper contains 14 sections, 7 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Taking a $3\times3$ convolutional kernel as an example, we explain the computation process of seven different RIConvs.
  • Figure 2: Various datasets used for evaluating the performance of RIConvs.
  • Figure 3: The classification accuracies of the seven RIConvs on 36 rotated test subsets of MNIST-Rot, each with a specific rotation angle ($0^{\circ}$, $10^{\circ}$, $20^{\circ}$, ..., $350^{\circ}$).
  • Figure 4: The performance of classical CNN backbones and the corresponding RI-CNN models on different tasks, where V, I, D, $R_{18}$, $R_{34}$, $R_{50}$, and $R_{101}$ represent VGG16, Inception V1, DenseNet40, ResNet18, ResNet34, ResNet50, and ResNet101, respectively.