Adaptive aggregation of Monte Carlo augmented decomposed filters for efficient group-equivariant convolutional neural network

Wenzhao Zhao, Barbara D. Wichtmann, Steffen Albert, Angelika Maurer, Frank G. Zöllner, Jürgen Hesser

TL;DR

This work targets the computational bottleneck of group-equivariant CNNs that rely on parameter sharing. It introduces a non-parameter-sharing framework built on adaptive aggregation of Monte Carlo augmented decomposed filters (MCG-CNN) and its weight-augmented variant (WMCG-CNN), enabling flexible, diverse transformation handling for both continuous and discrete groups, including shear transforms. The methodology combines Monte Carlo integration for group convolutions with filter decomposition into basis functions (e.g., Fourier-Bessel, Mexican hat) and bootstrapped samples, enabling efficient deployment in standard CNN backbones while preserving equivariance guarantees. Empirically, WMCG-CNN achieves superior or competitive performance on ImageNet and various denoising tasks with comparable parameter counts and computational cost, demonstrating improved sample efficiency and robustness to affine transformations. The work suggests a practical pathway to broader, more efficient group-equivariant architectures and points to future directions in basis design and application to other vision tasks.

Abstract

Group-equivariant convolutional neural networks (G-CNNs) rely heavily on parameter sharing to increase the data efficiency and performance of CNNs. However, the parameter-sharing strategy greatly increases the computational burden for each added parameter, which hampers its application to deep neural network models. In this paper, we address these problems by proposing a non-parameter-sharing approach for group-equivariant neural networks. The proposed methods adaptively aggregate a diverse range of filters by a weighted sum of stochastically augmented decomposed filters. We give a theoretical proof of how group equivariance can be achieved by our methods. Our method applies to both continuous and discrete groups, where the augmentation is implemented using Monte Carlo sampling and bootstrap resampling, respectively. Our methods also serve as an efficient extension of standard CNNs. The experiments show that, by using suitable filter bases to build efficient lightweight networks, our method outperforms parameter-sharing group-equivariant networks and enhances the performance of standard CNNs in image classification and denoising tasks. The code will be available at https://github.com/ZhaoWenzhao/MCG_CNN.
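The idea of a filter built as a weighted sum of stochastically augmented decomposed filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`mexican_hat`, `sample_affine`, `augmented_basis`, `compose_filter`), the sampling ranges for rotation, scale, and shear, and the choice of a single Mexican hat basis function are all hypothetical, and the random weights stand in for parameters that a network would learn.

```python
import math
import random


def mexican_hat(x, y, sigma=1.0):
    """2D Mexican hat basis function, up to a normalising constant."""
    r2 = (x * x + y * y) / (sigma * sigma)
    return (1.0 - 0.5 * r2) * math.exp(-0.5 * r2)


def sample_affine(rng, max_shear=0.25 * math.pi):
    """Draw one random 2x2 affine matrix (rotation, scale, shear).

    The sampling ranges are assumptions made for this sketch.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    scale = rng.uniform(1.0, 2.0)
    shear = math.tan(rng.uniform(-max_shear, max_shear))
    c, s = math.cos(theta), math.sin(theta)
    # M = scale * R(theta) @ [[1, shear], [0, 1]]
    return (scale * c, scale * (c * shear - s),
            scale * s, scale * (s * shear + c))


def augmented_basis(rng, k=5):
    """Sample the basis function on a k x k grid under a random affine map."""
    a, b, c, d = sample_affine(rng)
    det = a * d - b * c  # equals scale**2, so it is never zero
    half = k // 2
    filt = []
    for i in range(k):
        row = []
        for j in range(k):
            x, y = j - half, i - half
            # Evaluate the basis at M^{-1} (x, y)
            u = (d * x - b * y) / det
            v = (-c * x + a * y) / det
            row.append(mexican_hat(u, v))
        filt.append(row)
    return filt


def compose_filter(weights, bases):
    """Weighted sum of the augmented basis filters."""
    k = len(bases[0])
    return [[sum(w * B[i][j] for w, B in zip(weights, bases))
             for j in range(k)] for i in range(k)]


rng = random.Random(0)
bases = [augmented_basis(rng) for _ in range(4)]
weights = [rng.gauss(0.0, 1.0) for _ in bases]  # stand-ins for trainable weights
kernel = compose_filter(weights, bases)
```

In this sketch the transformation parameters are drawn once and then fixed, so only the weights would be trained; the resulting `kernel` is an ordinary 5x5 filter that could be used in a standard convolution layer.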

Paper Structure

This paper contains 20 sections, 2 theorems, 31 equations, 7 figures, and 8 tables.

Key Result

Theorem 2.1

Let $\mu_p$ be a probability measure on $(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$, i.e., $\mu_p(\mathbb{R}^d)=1$, where $\mathcal{B}(\mathbb{R}^d)$ denotes the Borel $\sigma$-algebra on $\mathbb{R}^d$ and $d$ is the number of dimensions. For $f\in L^2(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d),\mu_p)$, we define $I(f)=\int_{\mathbb{R}^d} f(x)\,d\mu_p(x)$ and $Q_N(f)=\frac{1}{N}\sum_{i=1}^{N} f(\xi_i)$, where $(\xi_i)_{i\in\mathbb{N}}$ is an i.i.d. sequence of random variables with distribution $\mu_p$.
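The setup in this theorem is that of plain Monte Carlo integration: the integral of $f$ against $\mu_p$ is approximated by averaging $f$ over i.i.d. draws from $\mu_p$. A minimal Python sketch follows, assuming for illustration that $\mu_p$ is the standard normal distribution on $\mathbb{R}$ (so $d=1$) and $f(x)=x^2$, whose exact integral is 1. The helper name `monte_carlo_mean` and the sample size are hypothetical choices, not taken from the paper.

```python
import random


def monte_carlo_mean(f, sampler, n):
    """Sample mean of f over n i.i.d. draws produced by sampler()."""
    total = 0.0
    for _ in range(n):
        total += f(sampler())
    return total / n


rng = random.Random(0)
# Assumed example: mu_p is the standard normal on R and f(x) = x^2,
# so the exact integral of f against mu_p is the variance, i.e. 1.
estimate = monte_carlo_mean(lambda x: x * x,
                            lambda: rng.gauss(0.0, 1.0),
                            200_000)
```

Because this $f$ is square-integrable under $\mu_p$, the error of the sample mean shrinks at the usual $1/\sqrt{N}$ rate, so `estimate` should lie close to 1 for a sample of this size.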

Figures (7)

  • Figure 1: An example of an affine transformation in real life. The image is from the CBSD432 dataset [martin2001database]. As shown in the white rectangular box, the horizontal lines of bricks undergo a shear transform along the vertical direction.
  • Figure 2: Examples of filter bases in one- and two-dimensional space. (a) The discrete Dirac delta basis; (b) the Fourier-Bessel basis; (c) the Mexican hat basis.
  • Figure 3: Integrating the proposed WMCG-CNN into the classic bottleneck architecture. (a) The example bottleneck block with group convolution using $3\times 3$ filters; (b) An example of filter composition with MC-augmented basis.
  • Figure 4: (a) The mGEs of the first hidden CNN layer (256 input and output channels) of different residual networks for 90 epochs of training on the ImageNet dataset. (b) The histogram of the learned weights for the FB basis of order $0$ in the first hidden CNN layer of ResNet18-k5-WMCG-shear-0.25$\pi$.
  • Figure 5: The output feature maps of CNNs for affinely transformed inputs. The original input image is cropped from a car image selected from the STL10 dataset [coates2011analysis]. The areas in the white rectangular boxes are upscaled to show the details. (a) Input 1; (b) The output of plain CNN for Input 1; (c) The output of RST-CNN for Input 1; (d) The output of WMCG-CNN for Input 1; (e) Input 2; (f) The output of plain CNN for Input 2; (g) The output of RST-CNN for Input 2; (h) The output of WMCG-CNN for Input 2.
  • ...and 2 more figures

Theorems & Definitions (3)

  • Theorem 2.1
  • Theorem 2.2
  • proof