On Symmetries in Convolutional Weights

Bilal Alsallakh, Timothy Wroge, Vivek Miglani, Narine Kokhlikyan

TL;DR

The paper investigates mean-kernel symmetry in convolutional networks using a metric based on the dihedral group $D_4$ to quantify per-layer symmetry: $S(K) = 1 - \frac{1}{2 \cdot |\mathscr{T}|} \sum_{T \in \mathscr{T}} \lVert T(\hat{K}) - \hat{K} \rVert_F$, where $\hat{K}$ is the normalized kernel. Across architectures such as VGG-16, Inception-V3, and ResNet variants, it shows that internal mean kernels tend to become more symmetric deeper in the network. Artificial asymmetry nevertheless arises at strided downsampling layers due to padding biases; it can be mitigated by techniques such as PartialConvolution, reflection padding, anti-aliasing, and SBPool. The study demonstrates that symmetry correlates with shift and flip consistency and can improve segmentation robustness when padding-induced skew is addressed, reinforcing symmetry as a potentially valuable inductive bias in CNNs. The work highlights directions for future research on the emergence of symmetry, its dependence on data augmentation and architecture, and its applicability to tasks beyond image classification.
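To make the metric $S(K)$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions that the summary does not confirm: the mean kernel is the average of a layer's weights over the filter and channel axes (following the (N, C, W, H) layout mentioned in Figure 1), $\hat{K}$ is $K$ divided by its Frobenius norm, and $\mathscr{T}$ consists of the seven non-identity elements of $D_4$. The function names `mean_kernel`, `d4_transforms`, and `symmetry` are hypothetical.

```python
import numpy as np


def mean_kernel(weights):
    """Average a conv weight tensor of shape (N, C, k, k) into one k x k kernel.

    Assumption: the mean is taken over filters (N) and input channels (C).
    """
    return np.asarray(weights, dtype=float).mean(axis=(0, 1))


def d4_transforms(kernel):
    """Return the 7 non-identity D4 images of a square 2-D array.

    Assumption: the identity is excluded from the transformation set.
    """
    rotations = [np.rot90(kernel, r) for r in range(4)]  # 0, 90, 180, 270 degrees
    reflections = [np.fliplr(r) for r in rotations]      # each rotation, mirrored
    return rotations[1:] + reflections


def symmetry(kernel):
    """Compute S(K) = 1 - (1 / (2|T|)) * sum_T ||T(K_hat) - K_hat||_F."""
    kernel = np.asarray(kernel, dtype=float)
    norm = np.linalg.norm(kernel)  # Frobenius norm for a 2-D array
    if norm == 0.0:
        raise ValueError("symmetry is undefined for an all-zero kernel")
    k_hat = kernel / norm  # assumption: normalize to unit Frobenius norm
    images = d4_transforms(k_hat)
    total = sum(np.linalg.norm(t - k_hat) for t in images)
    return 1.0 - total / (2 * len(images))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer_weights = rng.normal(size=(64, 3, 3, 3))  # hypothetical layer weights
    print(symmetry(mean_kernel(layer_weights)))
```

Under the unit-norm assumption, each distance $\lVert T(\hat{K}) - \hat{K} \rVert_F$ is at most 2, so the factor $\frac{1}{2}$ keeps $S(K)$ in $[0, 1]$: a kernel invariant under every transformation scores 1, and a skewed kernel scores lower.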

Abstract

We explore the symmetry of the mean $k \times k$ weight kernel in each layer of various convolutional neural networks. Unlike individual neurons, the mean kernels in internal layers tend to be symmetric about their centers instead of favoring specific directions. We investigate why this symmetry emerges in various datasets and models, and how it is impacted by certain architectural choices. We show how symmetry correlates with desirable properties such as shift and flip consistency, and might constitute an inherent inductive bias in convolutional neural networks.

Paper Structure

This paper contains 11 sections, 1 equation, 10 figures, 1 table.

Figures (10)

  • Figure 1: Left: Illustrating how we compute the mean kernel in a layer. Right: The mean kernels of AlexNet trained on different datasets. Title format: (N, C, W, H).
  • Figure 2: The symmetry of the mean kernel in each layer of two convolutional architectures, VGG-16 and Inception-V3, according to Eq. 1. Both models are trained on ImageNet.
  • Figure 3: The similarity profile of a VGG-16 model under various conditions. Left: The model is initialized with random weights. Middle: The model is trained to count heads in an image. Right: The model is trained on SVHN digit recognition.
  • Figure 4: Left: The mean kernels of a ResNet-18 model, pretrained on ImageNet. Right: The symmetry of the mean kernels. Notice the significant drops at layer2.01 and layer3.01, both of which have a stride of 2 (we indicate that with an asterisk next to the layer name).
  • Figure 5: The mean kernels of three VGG-based models trained on different datasets.
  • ...and 5 more figures