Gauge Equivariant Convolutional Networks and the Icosahedral CNN
Taco S. Cohen, Maurice Weiler, Berkay Kicanaoglu, Max Welling
TL;DR
This work develops a general theory of gauge equivariant convolution on manifolds, in which a network's features transform consistently under changes of local reference frame (gauge) rather than only under global symmetries. It introduces a concrete instantiation, the Icosahedral CNN, which uses a regular hexagonal grid and an atlas of charts on the icosahedron so that the gauge equivariant convolution can be computed with a single conv2d call. The method improves substantially on previous methods for segmenting omnidirectional images and global climate patterns, and it scales better than prior Spherical CNN approaches. By connecting frame-bundle concepts to a practical architecture in which weight sharing is enforced through kernel constraints, the paper provides a scalable framework for learning on curved geometries that depends only on their intrinsic geometry.
Abstract
The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry, and which includes many popular methods from equivariant and geometric deep learning. We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.
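To make the "single conv2d call" idea more concrete, the following is a minimal sketch of how a weight-shared hexagonal filter bank might be expanded and then applied in one dense correlation. It is not the authors' implementation. It assumes a structure group of six 60-degree rotations, a layer mapping one scalar input channel to six rotated output channels, and a hexagonal neighbourhood stored in axial coordinates inside a 3x3 array with two corners masked. The names `RING`, `hex_kernel`, `expand_c6`, and `correlate2d_bank` are hypothetical, and the NumPy correlation only stands in for a framework's conv2d. The handling of features across chart boundaries is omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Six neighbours of the centre cell in axial hex coordinates (q, r),
# ordered so that a 60-degree rotation maps RING[i] to RING[(i + 1) % 6].
RING = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def hex_kernel(center, ring):
    """Embed 1 centre weight and 6 ring weights in a 3x3 array.

    Entries [0, 0] and [2, 2] are not hexagonal neighbours and stay zero.
    """
    k = np.zeros((3, 3))
    k[1, 1] = center
    for (q, r), w in zip(RING, ring):
        k[r + 1, q + 1] = w
    return k


def expand_c6(center, ring):
    """Stack the six rotated copies of one base filter: shape (6, 3, 3).

    All six output channels share the same 7 free parameters.
    """
    ring = np.asarray(ring, dtype=float)
    return np.stack([hex_kernel(center, np.roll(ring, s)) for s in range(6)])


def correlate2d_bank(x, filters):
    """Valid cross-correlation of a 2D signal with a bank of 3x3 filters.

    Stand-in for a single conv2d call on one chart.
    """
    windows = sliding_window_view(x, (3, 3))  # (H - 2, W - 2, 3, 3)
    return np.einsum("hwij,oij->ohw", windows, filters)


ring = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # illustrative weights only
filters = expand_c6(0.5, ring)           # shape (6, 3, 3)
x = np.arange(25.0).reshape(5, 5)        # toy scalar signal on one chart
out = correlate2d_bank(x, filters)       # shape (6, 3, 3)
```

In this sketch the expansion step is the only place where the symmetry enters: the correlation itself is an ordinary dense operation, which is why it could be delegated to an optimized conv2d routine.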
