A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs

Lars Veefkind, Gabriele Cesa

TL;DR

This work tackles the rigidity of fixed equivariance in steerable CNNs by learning the degree of equivariance through a probabilistic likelihood over the transformation group. By parameterising $\lambda(h)$ with Fourier coefficients and regularising via normalisation, alignment, and KL terms, the method yields interpretable, layer-wise partial equivariance that adapts to data. Empirical results across 2D/3D benchmarks (e.g., DDMNIST, MedMNIST, Smoke/JetFlow) demonstrate competitive performance and clear interpretability of the learnt symmetry patterns, with bandlimiting providing regularisation and efficiency benefits. The approach generalises to any compact group and can model partial symmetries without adding extra layers, representing a practical improvement for symmetry-aware learning in vision and biomedical tasks.

Abstract

Steerable convolutional neural networks (SCNNs) enhance task performance by modelling geometric symmetries through equivariance constraints on weights. Yet, unknown or varying symmetries can lead to overconstrained weights and decreased performance. To address this, this paper introduces a probabilistic method to learn the degree of equivariance in SCNNs. We parameterise the degree of equivariance as a likelihood distribution over the transformation group using Fourier coefficients, offering the option to model layer-wise and shared equivariance. These likelihood distributions are regularised to ensure an interpretable degree of equivariance across the network. Advantages include the applicability to many types of equivariant networks through the flexible framework of SCNNs and the ability to learn equivariance with respect to any subgroup of any compact group without requiring additional layers. Our experiments reveal competitive performance on datasets with mixed symmetries, with learnt likelihood distributions that are representative of the underlying degree of equivariance.
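
The idea of a likelihood over the group parameterised by Fourier coefficients can be illustrated with a minimal sketch for the rotation group $SO(2)$. This is not the paper's implementation: the function names, the bandlimit of 2, and the particular normalisation (shift to non-negative values, then rescale to unit mean over a sampling grid) are assumptions made for illustration, and the alignment and KL regularisers are not reproduced. The sketch only shows that a few real Fourier coefficients determine a smooth function $\lambda(\theta)$ over rotation angles, where a flat $\lambda$ treats all rotations alike and a peaked $\lambda$ favours some rotations over others.

```python
import numpy as np

def evaluate_likelihood(a0, a, b, thetas):
    """Evaluate a bandlimited real Fourier series on SO(2) at the given angles.

    a0 is the constant coefficient; a[k-1] and b[k-1] are the cosine and sine
    coefficients of frequency k, so the bandlimit is L = len(a).
    """
    k = np.arange(1, len(a) + 1)        # frequencies 1..L
    angles = np.outer(thetas, k)        # shape (num_samples, L)
    return a0 + np.cos(angles) @ a + np.sin(angles) @ b

def normalise(values):
    """Assumed normalisation: shift to be non-negative, rescale to unit mean."""
    shifted = values - min(values.min(), 0.0)
    return shifted / shifted.mean()

# Uniform sampling grid over rotation angles in [0, 2*pi).
thetas = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)

# Only the constant coefficient is non-zero: a flat likelihood over all rotations.
uniform = normalise(evaluate_likelihood(1.0, np.zeros(2), np.zeros(2), thetas))

# Non-zero higher frequencies: a likelihood concentrated around the identity.
peaked = normalise(
    evaluate_likelihood(1.0, np.array([1.0, 0.5]), np.zeros(2), thetas)
)
```

Lowering the bandlimit (using fewer coefficients) restricts how sharply $\lambda$ can vary over the group, which is one way to read the bandlimiting mentioned in the TL;DR.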

Paper Structure (97 sections, 4 theorems, 122 equations, 21 figures, 24 tables)

This paper contains 97 sections, 4 theorems, 122 equations, 21 figures, 24 tables.

Key Result

Theorem B.30

(Irreps Decomposition (Peter-Weyl theorem part 1)) Any unitary (or orthogonal) representation $\rho:G\to V$ of a compact group $G$ over a field with characteristic zero (e.g. the complex $\mathbb{C}^n$ and real $\mathbb{R}^n$ fields) is a direct sum of irreducible representations. Each irrep corresp[…] where $I$ is an index set that specifies the irreducible representations $\psi_i$ containe[…]
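
A small numerical illustration of the direct-sum statement, not taken from the paper: for the cyclic group $C_4$ acting on $\mathbb{C}^4$ by cyclic shifts (its regular representation), the unitary discrete Fourier transform is a change of basis that diagonalises the generator. Each diagonal entry is a one-dimensional complex irrep evaluated at the generator. The variable names below are chosen for this sketch only.

```python
import numpy as np

N = 4  # order of the cyclic group C_N

# Generator of the regular representation: (P @ x)[n] = x[n - 1] (cyclic shift).
P = np.roll(np.eye(N), 1, axis=0)

# Unitary DFT matrix with entries exp(-2*pi*i*j*k/N) / sqrt(N).
F = np.fft.fft(np.eye(N)) / np.sqrt(N)

# Change of basis: in the Fourier basis the shift becomes diagonal.
D = F @ P @ F.conj().T

# The k-th one-dimensional irrep of C_N sends the generator to exp(-2*pi*i*k/N).
expected_irreps = np.exp(-2j * np.pi * np.arange(N) / N)

off_diagonal = D - np.diag(np.diag(D))
is_block_diagonal = np.allclose(off_diagonal, 0.0)
matches_irreps = np.allclose(np.diag(D), expected_irreps)
```

Because every element of $C_4$ is a power of the generator, the same change of basis diagonalises the whole representation, which is the direct sum of irreps in this simple abelian case.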

Figures (21)

  • Figure 1: Coronal CT view of left and right lung (Yang et al., 2023). Smaller features $\bigcirc$ (i.e., textures/components) are orientation invariant in determining whether the object is a lung, unlike the task of distinguishing between the two lungs as a whole, as they are (approximately) mirrored versions of each other.
  • Figure 2: Confusion matrices for DDMNIST with $O(2)$ symmetries. Labelled 0-99 from top to bottom and left to right.
  • Figure 3: Learnt likelihood $\lambda$ and error difference for layers 4 and 5 of our $O(2)$ PSCNN trained on DDMNIST with $O(2)$ and $C_1$ symmetries. Dotted line marks the $O(2)$ reflection domain transition. Note the scale of the error between the plots.
  • Figure 4: Likelihoods and errors of the fifth $O(2)$ PSCNN layer trained on $SO(2)$ DDMNIST under various bandlimits $L$.
  • Figure 5: Data ablation study on Organ and Nodule.
  • ...and 16 more figures

Theorems & Definitions (46)

  • Definition B.1
  • Example B.2
  • Definition B.3
  • Definition B.4
  • Example B.5
  • Definition B.6
  • Definition B.7
  • Definition B.8
  • Definition B.9
  • Definition B.10
  • ...and 36 more