Learning Color Equivariant Representations
Yulong Yang, Felix O'Mahony, Christine Allen-Blanchette
TL;DR
This work addresses the sensitivity of conventional CNNs to color perturbations by introducing group convolutional neural networks (GCNNs) that are equivariant to hue, saturation, and luminance shifts. A lifting layer transforms the input image directly into the color group space, which avoids the invalid RGB values that break equivariance in the prior CEConv approach. The approach reduces equivariance error by over three orders of magnitude, improves generalization under out-of-distribution color variations, and improves sample efficiency on synthetic and real datasets, including Hue-shift MNIST, Hue-shift 3D Shapes, Camelyon17, and several large-scale benchmarks. These color-aware representations enable new tasks such as color-based sorting and support perception that is robust to color variation. Future work is aimed at continuous group extensions and computational optimization.
Abstract
In this paper, we introduce group convolutional neural networks (GCNNs) equivariant to color variation. GCNNs have been designed for a variety of geometric transformations, from 2D and 3D rotation groups to semi-groups such as scale. Despite the improved interpretability, accuracy, and generalizability of these architectures, GCNNs have seen limited application in the context of perceptual quantities. Notably, the recent CEConv network uses a GCNN to achieve equivariance to hue transformations by convolving input images with a hue-rotated RGB filter. However, this approach leads to invalid RGB values, which break equivariance and degrade performance. We resolve these issues with a lifting layer that transforms the input image directly, thereby circumventing the issue of invalid RGB values and improving equivariance error by over three orders of magnitude. Moreover, we extend the notion of color equivariance to include equivariance to saturation and luminance shifts. Our hue-, saturation-, luminance-, and color-equivariant networks achieve strong generalization to out-of-distribution perceptual variations and improved sample efficiency over conventional architectures. We demonstrate the utility of our approach on synthetic and real-world datasets, where we consistently outperform competitive baselines.
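The contrast drawn above between rotating an RGB filter and transforming the input image directly can be illustrated with a small sketch. It is not the authors' implementation. It assumes that hue rotation of RGB values is modeled as a rotation about the gray axis (1, 1, 1), and that the lifting layer stacks copies of the image whose hue channel is shifted in HSV space by multiples of 1/n. The names `rotate_about_gray`, `shift_hue`, and `lift_hue` are hypothetical.

```python
import colorsys
import numpy as np

def rotate_about_gray(rgb, theta):
    """Rotate an RGB vector by angle theta about the gray axis (assumed model of RGB hue rotation)."""
    k = np.ones(3) / np.sqrt(3.0)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    # Rodrigues' rotation formula in matrix form
    R = np.cos(theta) * np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * np.outer(k, k)
    return R @ np.asarray(rgb, dtype=float)

def shift_hue(img, delta):
    """Shift the hue of an (H, W, 3) RGB image with values in [0, 1] by delta (fraction of a full turn)."""
    out = np.empty_like(img)
    for idx in np.ndindex(img.shape[:2]):
        h, s, v = colorsys.rgb_to_hsv(*img[idx])
        out[idx] = colorsys.hsv_to_rgb((h + delta) % 1.0, s, v)
    return out

def lift_hue(img, n):
    """Hypothetical lifting: stack n hue-shifted copies of the image, shape (n, H, W, 3)."""
    return np.stack([shift_hue(img, k / n) for k in range(n)])

# Rotating pure red by 60 degrees in RGB space leaves the valid [0, 1] cube.
rotated_red = rotate_about_gray([1.0, 0.0, 0.0], np.pi / 3)

# Shifting hue in HSV space keeps every value inside [0, 1].
rng = np.random.default_rng(0)
img = rng.uniform(0.2, 0.8, size=(4, 4, 3))
n = 3
lifted = lift_hue(img, n)

# Equivariance check: a hue shift of the input by 1/n cyclically permutes the lifted stack.
lifted_of_shifted = lift_hue(shift_hue(img, 1 / n), n)
is_equivariant = np.allclose(lifted_of_shifted, np.roll(lifted, -1, axis=0), atol=1e-6)
```

In this sketch the RGB-space rotation produces a negative blue component for pure red, whereas the HSV-space shift stays in range and the lifted stack is permuted exactly (up to floating-point error) when the input hue is shifted by one group element.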
