Efficient 3D affinely equivariant CNNs with adaptive fusion of augmented spherical Fourier-Bessel bases

Wenzhao Zhao, Steffen Albert, Barbara D. Wichtmann, Angelika Maurer, Ulrike Attenberger, Frank G. Zöllner, Jürgen Hesser

TL;DR

This work tackles inefficiencies in 3D group-equivariant CNNs by introducing a non-parameter-sharing, continuous affine G-CNN that operates under $GL^+(3,\mathbb{R})$ and uses Monte Carlo augmented spherical Fourier-Bessel bases to achieve joint angular and radial orthogonality. The approach, dubbed Weighted Monte Carlo G-CNN with spherical Fourier-Bessel bases, enables adaptive fusion of filter bases per channel and reduces training-time costs while maintaining comparable inference costs to standard CNNs. Empirical results across six volumetric datasets (medical and seismic) show improved affine group equivariance and competitive or superior segmentation accuracy, with notable gains in training stability and data efficiency. The method offers a flexible, plug-in operator for dense 3D volumetric processing, with practical implications for robust 3D medical and geoscience imaging tasks.

Abstract

Filter-decomposition-based group equivariant convolutional neural networks (CNNs) have shown promising stability and data efficiency for 3D image feature extraction. However, these networks, which rely on parameter sharing and discrete transformation groups, often underperform in modern deep neural network architectures when processing volumetric images with dense 3D textures, such as common 3D medical images. To address these limitations, this paper presents an efficient non-parameter-sharing continuous 3D affine group equivariant neural network for volumetric images. This network uses an adaptive aggregation of Monte Carlo augmented spherical Fourier-Bessel filter bases to improve the efficiency and flexibility of 3D group equivariant CNNs for volumetric data. Unlike existing methods that focus only on angular orthogonality in filter bases, the introduced spherical Fourier-Bessel filter basis incorporates both angular and radial orthogonality to improve feature extraction. Experiments on four medical image segmentation datasets and two seismic datasets show that the proposed methods achieve better affine group equivariance and superior segmentation accuracy than existing 3D group equivariant convolutional neural network layers, and that they significantly improve on the training stability and data efficiency of conventional CNN layers (at the 0.05 significance level). The code is available at https://github.com/ZhaoWenzhao/WMCSFB.
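
As a small illustration of what radial orthogonality means here, the following sketch checks it numerically for the simplest case only. It assumes the standard zeroth-order spherical Bessel function $j_0(x)=\sin(x)/x$ on the unit ball, whose zeros lie at $n\pi$; the angular part and higher orders used by the paper's bases are omitted, and the helper names `j0` and `radial_inner_product` are hypothetical.

```python
import math

def j0(x):
    """Zeroth-order spherical Bessel function sin(x)/x, with j0(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def radial_inner_product(n1, n2, steps=20000):
    """Trapezoidal estimate of the integral of j0(n1*pi*r) * j0(n2*pi*r) * r^2 over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * j0(n1 * math.pi * r) * j0(n2 * math.pi * r) * r * r
    return total * h

orders = (1, 2, 3)
# Gram matrix of the radial functions under the r^2 weight of spherical coordinates.
gram = [[radial_inner_product(a, b) for b in orders] for a in orders]

for row in gram:
    print(["{:+.6f}".format(v) for v in row])
```

Analytically, the off-diagonal entries vanish and the diagonal entries equal $1/(2n^2\pi^2)$, so the printed matrix should be diagonal up to rounding.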

Paper Structure (22 sections, 2 theorems, 28 equations, 3 figures, 6 tables)

Key Result

Theorem 1

Any element of $GL^+(3,\mathbb{R})$ can be decomposed into the form given by the paper's matrix decomposition equation (eq:matrixdecomp). Conversely, any $3\times 3$ matrix constructed via that decomposition belongs to $GL^+(3,\mathbb{R})$.
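
The decomposition referenced in Theorem 1 is not reproduced in this summary. As a loosely related illustration only, and not the paper's construction, the sketch below uses the singular value decomposition to factor a matrix with positive determinant into rotation, positive scaling, and rotation, and checks the converse direction through the determinant. The function name `rotation_scaling_factors` and the example matrix are assumptions made for this example.

```python
import numpy as np

def rotation_scaling_factors(a):
    """Factor a 3x3 matrix with det > 0 as r1 @ diag(s) @ r2, with r1, r2 in SO(3) and s > 0."""
    a = np.asarray(a, dtype=float)
    if a.shape != (3, 3) or np.linalg.det(a) <= 0:
        raise ValueError("expected a 3x3 matrix with positive determinant")
    u, s, vt = np.linalg.svd(a)
    # det(a) > 0 and s > 0 imply det(u) and det(vt) share a sign.
    # If both are -1, flip one matching column/row pair; the product is unchanged.
    if np.linalg.det(u) < 0:
        u[:, -1] *= -1
        vt[-1, :] *= -1
    return u, s, vt

# Hypothetical example matrix with positive determinant.
a = np.array([[2.0, 0.5, 0.0],
              [0.1, 1.5, 0.3],
              [0.0, -0.2, 1.0]])
r1, s, r2 = rotation_scaling_factors(a)

# Converse direction: rotations have det 1 and the scalings are positive,
# so the rebuilt product has positive determinant.
rebuilt = r1 @ np.diag(s) @ r2
print(np.allclose(rebuilt, a), np.linalg.det(rebuilt) > 0)
```

The paper's own parameterization may use different factors; this sketch only shows that membership in $GL^+(3,\mathbb{R})$ is compatible with a factorization into elementary transformations.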

Figures (3)

  • Figure 1: The comparison of group equivariant convolutions, where $\cdot$ means multiplication and $*$ means convolution. $w_i$ denotes the $i$-th learnable weight, $\psi(g_i)$ denotes the filter augmented by the $i$-th transformation group member $g_i$, and $f_i$ denotes the $i$-th feature map. (a) The parameter-sharing strategy for common G-CNNs [kondor2018generalization, weiler20183d, cesa2021program]; (b) The WMCG-CNN [zhao2023adaptive].
  • Figure 2: An example of a spherical Fourier-Bessel basis with $l=1$, $m=2$ and $n=1$. (a) The overall 3D view of the basis function; (b) The cross-section perpendicular to the x-axis; (c) The cross-section perpendicular to the y-axis; (d) The cross-section perpendicular to the z-axis.
  • Figure 3: Examples of axial slices with colored segmentation labels for different datasets. (a) BTCV; (b) BraTS; (c) Netherlands F3.
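
The weighted combination named in the Figure 1 caption can be sketched as follows. This is one possible reading, assuming that each (output channel, input channel) pair has its own weights $w_i$ over a set of augmented bases $\psi(g_i)$. The bases here are random placeholder arrays, not actual spherical Fourier-Bessel functions, and all sizes and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bases, c_out, c_in, k = 8, 4, 3, 5  # hypothetical sizes

# Placeholder stand-ins for the augmented bases psi(g_i); random, NOT real basis functions.
bases = rng.normal(size=(n_bases, k, k, k))

# Assumed layout: one learnable weight per (output channel, input channel, basis).
weights = rng.normal(size=(c_out, c_in, n_bases))

# Fuse the bases into ordinary 3D convolution kernels: sum_i w_i * psi(g_i).
filters = np.einsum("oin,nxyz->oixyz", weights, bases)

# Response at a single spatial location for one input patch.
patch = rng.normal(size=(c_in, k, k, k))
fused = np.einsum("oixyz,ixyz->o", filters, patch)

# Same response computed per basis first, then weighted.
per_basis = np.einsum("nxyz,ixyz->ni", bases, patch)
separate = np.einsum("oin,ni->o", weights, per_basis)

print(filters.shape, np.allclose(fused, separate))
```

Because the operation is linear, fusing the bases first gives the same response as weighting per-basis responses. Under this reading, the fused kernels could be computed once and then applied like standard convolution kernels.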

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • proof