A Novel Convolutional Neural Network Architecture with a Continuous Symmetry
Yao Liu, Hang Shao, Bing Bai
TL;DR
This work introduces a Convolutional Neural Network architecture inspired by quasi-linear hyperbolic PDEs, whose weight space admits a continuous symmetry under transformations from a Lie group such as $GL(n,\mathbb{R})$. By replacing standard per-activation nonlinearities with a nonlinear coupling across branches and employing variable-coefficient convolutions, the model can mix channels and, in some configurations, remove most activations without sacrificing performance. Experiments on a 100-class ImageNet subset, using a ResNet50 backbone, show competitive accuracy (up to $84.96\%$ top-1) with modest parameter counts, and activation-placement strategies are used to mitigate training instabilities. The paper argues that PDE perspectives can yield novel architectural designs and deeper interpretations of ConvNets, with potential extensions to other architectures such as Transformers.
Abstract
This paper introduces a new Convolutional Neural Network (ConvNet) architecture inspired by a class of partial differential equations (PDEs) called quasi-linear hyperbolic systems. While achieving comparable performance on image classification, it allows the weights to be modified via a continuous symmetry group. This is a significant shift from traditional models, where the architecture and weights are essentially fixed. We wish to promote this (internal) symmetry as a new desirable property for a neural network, and to draw the attention of the broader Deep Learning community to the PDE perspective in analyzing and interpreting ConvNets.
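To make the idea of a continuous weight-space symmetry concrete, the following toy sketch may help. It is not the architecture proposed in the paper: it assumes a hypothetical two-layer map with weights `W1` and `W2` and an arbitrary invertible matrix `A` standing in for an element of $GL(n,\mathbb{R})$. When no per-activation nonlinearity sits between the two layers, replacing `W1` by `A @ W1` and `W2` by `W2 @ inv(A)` leaves the output unchanged. With an elementwise ReLU in between, the same transformation generally changes the output, since ReLU commutes only with permutations and positive diagonal scalings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper).
n, d_in, d_out = 4, 6, 3
W1 = rng.standard_normal((n, d_in))    # first linear map (stand-in for channel mixing)
W2 = rng.standard_normal((d_out, n))   # second linear map
x = rng.standard_normal(d_in)          # a single input vector

# A generic random matrix is invertible; verify before using it as a GL(n, R) element.
A = rng.standard_normal((n, n))
assert abs(np.linalg.det(A)) > 1e-8


def linear_net(W1, W2, x):
    """Two linear layers with no nonlinearity in between."""
    return W2 @ (W1 @ x)


def relu_net(W1, W2, x):
    """Same layers with an elementwise ReLU in between."""
    return W2 @ np.maximum(W1 @ x, 0.0)


# Transform the weights by A and its inverse.
W1_t = A @ W1
W2_t = W2 @ np.linalg.inv(A)

# Largest absolute change in the output caused by the transformation.
linear_gap = np.max(np.abs(linear_net(W1, W2, x) - linear_net(W1_t, W2_t, x)))
relu_gap = np.max(np.abs(relu_net(W1, W2, x) - relu_net(W1_t, W2_t, x)))

print(f"linear network output change: {linear_gap:.2e}")
print(f"ReLU network output change:   {relu_gap:.2e}")
```

The first printed gap should be at the level of floating-point round-off, while the second is typically far larger. This contrast is one way to read the motivation for moving away from per-activation nonlinearities when a continuous symmetry is wanted.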
