Affine Invariance in Continuous-Domain Convolutional Neural Networks

Ali Mohaddes; Johannes Lederer

TL;DR

This work tackles the challenge of achieving affine invariance in continuous-domain convolutional neural networks by embedding inputs into the affine group $G_2 = \mathbb{R}^2 \ltimes \mathrm{GL}_2(\mathbb{R})$ and employing a three-layer lifting-convolution-projection GCNN. The authors establish stability under affine transforms from $\mathrm{GL}_2(\mathbb{R})$ via layer-wise invariance theorems, and derive practical computation strategies that reduce $G_2$ convolutions to real-space integrals using a QR-based $\mathrm{GL}(2)$ decomposition. They validate the approach experimentally on affine-transformed digits, showing that the GCNN outperforms a standard CNN, particularly in data-scarce settings (e.g., mean accuracies such as $0.6950$ vs. $0.3150$ and $0.800$ vs. $0.720$ in reported scenarios). These contributions broaden the class of geometric transformations addressable by GCNNs and offer computationally feasible means to enforce affine invariance in deep learning pipelines.
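The QR-based decomposition mentioned above can be illustrated with a small NumPy sketch. This summary does not give the paper's exact parameterization, so the code below only shows the generic factorization of an invertible $2 \times 2$ matrix into an orthogonal factor and an upper-triangular factor with positive diagonal. The function name `qr_gl2`, the tolerance, and the example matrix are assumptions made for illustration.

```python
import numpy as np

def qr_gl2(A, tol=1e-12):
    """Factor an invertible 2x2 matrix as A = Q @ R.

    Q is orthogonal (a rotation or a reflection) and R is upper
    triangular with a strictly positive diagonal. Hypothetical helper,
    not taken from the paper.
    """
    A = np.asarray(A, dtype=float)
    if A.shape != (2, 2) or abs(np.linalg.det(A)) < tol:
        raise ValueError("expected an invertible 2x2 matrix")
    Q, R = np.linalg.qr(A)
    # np.linalg.qr does not fix the signs of diag(R); flip them so that
    # diag(R) > 0, which makes the factorization unique.
    signs = np.sign(np.diag(R))
    Q = Q * signs              # scale the columns of Q
    R = signs[:, None] * R     # scale the rows of R
    return Q, R

# Arbitrary example matrix (an assumption, not one of the paper's).
A = np.array([[2.0, 1.0], [0.5, 3.0]])
Q, R = qr_gl2(A)
```

With the diagonal of `R` forced positive, the orthogonal factor carries the rotation or reflection part while `R` carries scaling and shear, which is the kind of split that lets an integral over $\mathrm{GL}_2(\mathbb{R})$ be organized by factor.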

Abstract

The notion of group invariance helps neural networks recognize patterns and features under geometric transformations. Group convolutional neural networks enhance traditional convolutional neural networks by incorporating group-based geometric structures into their design. This research studies affine invariance in continuous-domain convolutional neural networks. Whereas other research considers isometric or similarity invariance, we focus on the full structure of affine transforms generated by the group of all invertible $2 \times 2$ real matrices (the general linear group $\mathrm{GL}_2(\mathbb{R})$). We introduce a new criterion to assess the invariance of two signals under affine transformations. The input image is embedded into the affine Lie group $G_2 = \mathbb{R}^2 \ltimes \mathrm{GL}_2(\mathbb{R})$ to facilitate group convolution operations that respect affine invariance. Then, we analyze the convolution of embedded signals over $G_2$. In sum, our research could eventually extend the scope of geometric transformations that standard deep-learning pipelines can handle.
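To make the semidirect-product structure of $G_2$ concrete, here is a minimal Python sketch of its group law. It assumes the common convention in which an element $(x, A)$ acts on a point $p \in \mathbb{R}^2$ by $p \mapsto Ap + x$, so that $(x, A)(y, B) = (x + Ay, AB)$. The paper may use a different but equivalent convention, and the function names `compose`, `inverse`, and `act` are hypothetical.

```python
import numpy as np

def compose(g, h):
    """Group product in R^2 x| GL_2(R): (x, A)(y, B) = (x + A y, A B)."""
    x, A = g
    y, B = h
    return x + A @ y, A @ B

def inverse(g):
    """Group inverse: (x, A)^{-1} = (-A^{-1} x, A^{-1})."""
    x, A = g
    A_inv = np.linalg.inv(A)
    return -A_inv @ x, A_inv

def act(g, p):
    """Action of g = (x, A) on a point p of the plane: p -> A p + x."""
    x, A = g
    return A @ p + x

# Two arbitrary affine maps and a test point (illustrative values only).
g = (np.array([1.0, -2.0]), np.array([[2.0, 1.0], [0.5, 3.0]]))
h = (np.array([0.5, 0.25]), np.array([[1.0, 0.7], [0.7, 1.0]]))
p = np.array([3.0, 4.0])

# Acting with the product equals acting with h first, then with g.
lhs = act(compose(g, h), p)
rhs = act(g, act(h, p))
```

The identity element is $(0, I)$. Composing `g` with `inverse(g)` recovers it up to floating-point error.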

Paper Structure (10 sections, 6 theorems, 63 equations, 7 figures)

Introduction
Preliminaries
Group convolutional neural networks architecture
Main Result
Problem Statement
Convolution Computation
Integral over $G_2$
Experiments
Conclusion
Appendix

Key Result

Theorem 1

If $\Sigma$ is the G-CNN consisting of three layers (lifting, convolutional, and $\mathbb{R}$-projection), and if the distance between a function $f_1$ and the affine transform of another function $f_2$ is less than $\epsilon$, then $|\Sigma f_1 - \Sigma f_2| < c \epsilon$, where $c = \|k_1\|_1^{\mathbb

Figures (7)

Figure 1: Original letters and their affine-invariant CAPTCHA.
Figure 2: Two families of related affine-invariant functions, denoted by dashed and solid lines.
Figure 3: Accuracy comparison between G-CNN and standard CNN across varying sample sizes. G-CNN outperforms CNN under affine transformation $A = \begin{pmatrix} 2.5 & 0.7 \\ 0.6 & 1.8 \end{pmatrix}$.
Figure 4: Accuracy comparison between G-CNN and standard CNN across varying sample sizes. G-CNN outperforms CNN for relatively small sample sizes under affine transformation $A = \begin{pmatrix} 1 & 0.7 \\ 0.7 & 1 \end{pmatrix}$.
Figure 5: Prediction comparison under affine transformation $A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$. G-CNN outperforms CNN with higher mean accuracy (0.6950 vs. 0.3150).
...and 2 more figures

Theorems & Definitions (33)

Definition 1: Group
Example 1: Translation group
Definition 2: Lie groups
Example 2: Roto-translation group
Definition 3: Group action
Definition 4: Representation
Definition 5: Regular representation
Example 3: Regular representation of roto-translation group
Definition 6: Coset
Definition 7: Quotient Space
...and 23 more
