Accurate Shift Invariant Convolutional Neural Networks Using Gaussian-Hermite Moments

Jaspreet Singh, Petra Bosilj, Grzegorz Cielniak

Abstract

Convolutional neural networks (CNNs) are not inherently shift invariant or equivariant. The downsampling operation used in CNNs is one of the main reasons they lose shift invariance. At the same time, downsampling is important for computational efficiency and for enlarging the receptive field so that it captures more contextual information. In this work, we propose Gaussian-Hermite Sampling (GHS), a novel downsampling strategy designed to achieve accurate shift invariance. GHS uses Gaussian-Hermite polynomials to perform shift-consistent sampling, so that CNN layers are invariant to arbitrary spatial shifts even before training. When integrated into standard CNN architectures, the proposed method embeds shift invariance directly at the layer level without requiring architectural modifications or additional training procedures. We evaluate the proposed approach on the CIFAR-10, CIFAR-100, and MNIST-rot datasets. Experimental results show that GHS significantly improves shift consistency, achieving 100% classification consistency under spatial shifts, while also improving classification accuracy over baseline CNN models.
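The claim that downsampling breaks shift invariance can be seen on a toy signal. The following is a minimal sketch, not code from the paper: the signal `x` and the helper `downsample` are illustrative assumptions, and the shift is circular (wrap-around).

```python
import numpy as np

def downsample(x, stride=2):
    """Keep every `stride`-th sample, starting at index 0 (hypothetical helper)."""
    return x[::stride]

# Toy 1D signal whose large values all sit at odd indices.
x = np.array([0, 9, 0, 1, 0, 3, 0, 2])
x_shifted = np.roll(x, 1)  # circular shift by one sample

y = downsample(x)                  # samples indices 0, 2, 4, 6 of x
y_shifted = downsample(x_shifted)  # samples what were the odd indices of x

print(y)          # [0 0 0 0]
print(y_shifted)  # [2 9 1 3]
```

A one-sample shift changes which samples survive, so even a global maximum taken after downsampling differs between the two inputs (0 versus 9 here).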

Paper Structure (16 sections, 2 theorems, 24 equations, 5 figures, 6 tables)

Key Result

Theorem 3.1

The operator $\mathbf{A}^*_{\max}(\mathbf{F})$ is wrap-shift invariant.
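To make the notion of wrap-shift invariance concrete, the sketch below checks whether an operator returns the same output for every circular shift of its input. It does not implement the paper's operator $\mathbf{A}^*_{\max}$: the function `max_anchored_subsample` is a hypothetical stand-in that re-centres the input on its maximum before subsampling, and it assumes the maximum is unique.

```python
import numpy as np

def wrap_shift(F, dy, dx):
    """Circularly shift a 2D array by (dy, dx)."""
    return np.roll(F, shift=(dy, dx), axis=(0, 1))

def plain_subsample(F, stride=2):
    """Ordinary strided subsampling anchored at the top-left corner."""
    return F[::stride, ::stride]

def max_anchored_subsample(F, stride=2):
    """Hypothetical stand-in: roll the maximum to (0, 0), then subsample.
    Assumes F has a unique maximum."""
    iy, ix = np.unravel_index(np.argmax(F), F.shape)
    centred = np.roll(F, shift=(-iy, -ix), axis=(0, 1))
    return centred[::stride, ::stride]

def is_wrap_shift_invariant(op, F):
    """Return True if op(F) is unchanged under every circular shift of F."""
    ref = op(F)
    H, W = F.shape
    return all(
        np.array_equal(op(wrap_shift(F, dy, dx)), ref)
        for dy in range(H)
        for dx in range(W)
    )

rng = np.random.default_rng(0)
# Distinct values guarantee a unique maximum.
F = rng.permutation(64).reshape(8, 8).astype(float)

print(is_wrap_shift_invariant(plain_subsample, F))         # False
print(is_wrap_shift_invariant(max_anchored_subsample, F))  # True
```

Anchoring the sampling grid to a feature that moves with the image is what removes the dependence on the shift in this toy case; the theorem above states the corresponding property for the paper's own operator.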

Figures (5)

  • Figure 1: Illustration of invariance (left) and equivariance (right). In invariance, the operator output remains unchanged under the transformation. In equivariance, applying the operator after a transformation yields the same result as transforming the operator output.
  • Figure 2: Gaussian-Hermite polynomials of order 0 to 4 with Gaussian weighting $(\sigma=0.75)$.
  • Figure 3: GHS process for an image. (a) Unshifted ($\mathbf{F}$) and shifted ($\mathbf{F}'$) images with their maximum locations. (b) Construction of the GHM $\mathbf{A}^*_{\max}(\mathbf{F})$ using $\mathbf{Q}$ and $\mathbf{Q}^T$. (c) Downsampled image $\widehat{\mathbf{F}}^*$ obtained using Eq. (\ref{eq:reconstructionMatrixEquiv}).
  • Figure 4: Comparison of downsampling methods. (a) Original image, (b-e) unshifted downsampled results, (f) shifted image and (g-j) shifted downsampled results.
  • Figure 5: Consistency (left) and accuracy (right) of ResNet-18 with different downsampling methods on CIFAR-10 under random patch erasures.
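The Gaussian-weighted Hermite functions mentioned in Figure 2 can be sketched numerically. The normalization below is the common orthonormal form $\hat{H}_p(x;\sigma) = \left(2^p\, p!\, \sqrt{\pi}\, \sigma\right)^{-1/2} e^{-x^2/(2\sigma^2)} H_p(x/\sigma)$; this is an assumption, since the paper's exact definition and discretization are not reproduced here. The function name `gaussian_hermite` and the sampling grid are likewise illustrative.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials H_p

def gaussian_hermite(p, x, sigma=0.75):
    """Orthonormal Gaussian-Hermite function of order p (assumed normalization)."""
    coeffs = np.zeros(p + 1)
    coeffs[p] = 1.0  # select H_p in the Hermite series
    norm = math.sqrt(2.0**p * math.factorial(p) * math.sqrt(math.pi) * sigma)
    return hermval(x / sigma, coeffs) * np.exp(-x**2 / (2 * sigma**2)) / norm

# Orders 0 to 4 with sigma = 0.75, as in the Figure 2 caption.
x = np.linspace(-6, 6, 4001)
dx = x[1] - x[0]
basis = np.stack([gaussian_hermite(p, x) for p in range(5)])

# Riemann-sum approximation of the pairwise inner products.
gram = basis @ basis.T * dx
print(np.allclose(gram, np.eye(5), atol=1e-6))
```

With this normalization the functions are mutually orthogonal with unit norm, which the Gram matrix check confirms numerically on the chosen grid.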

Theorems & Definitions (4)

  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof