Affine Invariance in Continuous-Domain Convolutional Neural Networks

Ali Mohaddes, Johannes Lederer

TL;DR

This work tackles the challenge of achieving affine invariance in continuous-domain convolutional neural networks by embedding inputs into the affine group $G_2 = \mathbb{R}^2 \ltimes \mathrm{GL}_2(\mathbb{R})$ and employing a three-layer lifting-convolution-projection GCNN. The authors establish stability under affine transforms from $\mathrm{GL}_2(\mathbb{R})$ via layer-wise invariance theorems, and derive practical computation strategies that reduce $G_2$ convolutions to real-space integrals using a QR-based decomposition of $\mathrm{GL}_2(\mathbb{R})$. They validate the approach experimentally on affine-transformed digits, showing that the GCNN outperforms a standard CNN, particularly in data-scarce settings (e.g., mean accuracies such as $0.6950$ vs. $0.3150$ and $0.800$ vs. $0.720$ in reported scenarios). These contributions broaden the class of geometric transformations addressable by GCNNs and offer computationally feasible means to enforce affine invariance in deep learning pipelines.
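To make the QR-based decomposition mentioned above concrete, the following is a minimal sketch, assuming NumPy and a hypothetical helper name `qr_factor_gl2`. It splits an invertible $2 \times 2$ matrix into an orthogonal factor, a positive diagonal scaling, and a unit upper-triangular shear. The paper's exact parametrization may differ; this only illustrates the general idea.

```python
import numpy as np


def qr_factor_gl2(A, tol=1e-12):
    """Factor an invertible 2x2 matrix as A = Q @ D @ U (illustrative sketch).

    Q: orthogonal (rotation or reflection)
    D: diagonal with strictly positive entries (axis scalings)
    U: upper triangular with unit diagonal (shear)
    """
    A = np.asarray(A, dtype=float)
    if A.shape != (2, 2):
        raise ValueError("expected a 2x2 matrix")
    if abs(np.linalg.det(A)) < tol:
        raise ValueError("matrix is not invertible, so it is not in GL_2(R)")

    Q, R = np.linalg.qr(A)

    # np.linalg.qr does not guarantee a positive diagonal in R.
    # Flip signs consistently so that diag(R) > 0; Q @ R is unchanged.
    signs = np.sign(np.diag(R))
    Q = Q * signs            # scales the columns of Q
    R = signs[:, None] * R   # scales the rows of R

    D = np.diag(np.diag(R))      # positive scalings
    U = np.linalg.solve(D, R)    # D^{-1} R, unit upper triangular
    return Q, D, U


# Example matrix chosen only for illustration.
A_example = np.array([[2.0, 1.0], [0.5, 3.0]])
Q, D, U = qr_factor_gl2(A_example)
reconstructed = Q @ D @ U
```

Multiplying the three factors back together recovers the input up to floating-point error. With such a factorization, an integral over $\mathrm{GL}_2(\mathbb{R})$ can in principle be organized as nested integrals over the rotation, scaling, and shear parameters.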

Abstract

The notion of group invariance helps neural networks recognize patterns and features under geometric transformations. Group convolutional neural networks enhance traditional convolutional neural networks by incorporating group-based geometric structures into their design. This research studies affine invariance in continuous-domain convolutional neural networks. Whereas other research considers isometric invariance or similarity invariance, we focus on the full structure of affine transforms generated by the group of all invertible $2 \times 2$ real matrices (the general linear group $\mathrm{GL}_2(\mathbb{R})$). We introduce a new criterion to assess the invariance of two signals under affine transformations. The input image is embedded into the affine Lie group $G_2 = \mathbb{R}^2 \ltimes \mathrm{GL}_2(\mathbb{R})$ to facilitate group convolution operations that respect affine invariance. Then, we analyze the convolution of embedded signals over $G_2$. In sum, our research could eventually extend the scope of geometric transformations that standard deep-learning pipelines can handle.
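The structure of the affine group $G_2 = \mathbb{R}^2 \ltimes \mathrm{GL}_2(\mathbb{R})$ can be illustrated with a short sketch. It assumes the common semidirect-product convention $(x, A)(y, B) = (x + Ay, AB)$, with an element $(x, A)$ acting on a point $p$ by $p \mapsto Ap + x$. The paper may use a different but equivalent ordering, and the function names below are hypothetical.

```python
import numpy as np


def compose(g, h):
    """Group product in R^2 x| GL_2(R): (x, A)(y, B) = (x + A y, A B)."""
    (x, A), (y, B) = g, h
    return (x + A @ y, A @ B)


def inverse(g):
    """Group inverse: (x, A)^{-1} = (-A^{-1} x, A^{-1})."""
    x, A = g
    A_inv = np.linalg.inv(A)
    return (-A_inv @ x, A_inv)


def act(g, p):
    """Action of (x, A) on a point p in R^2: p -> A p + x."""
    x, A = g
    return A @ p + x


# Illustrative elements; the values are arbitrary.
g = (np.array([1.0, -2.0]), np.array([[2.0, 1.0], [0.5, 3.0]]))
h = (np.array([0.5, 0.25]), np.array([[1.0, 0.7], [0.0, 1.0]]))
p = np.array([3.0, 4.0])

identity_shift, identity_matrix = compose(g, inverse(g))
lhs = act(compose(g, h), p)   # act with the product
rhs = act(g, act(h, p))       # act with h first, then g
```

Under this convention, composing an element with its inverse yields the identity $(0, I)$, and acting with a product agrees with acting successively, which is the compatibility that group convolutions over $G_2$ rely on.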

Paper Structure (10 sections, 6 theorems, 63 equations, 7 figures)

This paper contains 10 sections, 6 theorems, 63 equations, 7 figures.

Key Result

Theorem 1

If $\Sigma$ is the G-CNN consisting of three layers (lifting, convolutional, and $\mathbb{R}$-projection), and if the distance between a function $f_1$ and the affine transform of another function $f_2$ is less than $\epsilon$, then $|\Sigma f_1 - \Sigma f_2| < c \epsilon$, where $c = \|k_1\|_1^{\mathbb

Figures (7)

  • Figure 1: Original letters and their affine-invariant CAPTCHA.
  • Figure 2: Two families of related affine-invariant functions, denoted by dashed and solid lines.
  • Figure 3: Accuracy comparison between G-CNN and standard CNN across varying sample sizes. G-CNN outperforms CNN under affine transformation $A = \begin{pmatrix} 2.5 & 0.7 \\ 0.6 & 1.8 \end{pmatrix}$.
  • Figure 4: Accuracy comparison between G-CNN and standard CNN across varying sample sizes. G-CNN outperforms CNN for relatively small sample sizes under affine transformation $A = \begin{pmatrix} 1 & 0.7 \\ 0.7 & 1 \end{pmatrix}$.
  • Figure 5: Prediction comparison under affine transformation $A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$. G-CNN outperforms CNN with higher mean accuracy (0.6950 vs. 0.3150).
  • ...and 2 more figures

Theorems & Definitions (33)

  • Definition 1: Group
  • Example 1: Translation group
  • Definition 2: Lie groups
  • Example 2: Roto-translation group
  • Definition 3: Group action
  • Definition 4: Representation
  • Definition 5: Regular representation
  • Example 3: Regular representation of roto-translation group
  • Definition 6: Coset
  • Definition 7: Quotient Space
  • ...and 23 more