On the effectiveness of Rotation-Equivariance in U-Net: A Benchmark for Image Segmentation
Robin Ghyselinck, Valentin Delchevalerie, Bruno Dumas, Benoît Frénay
TL;DR
The paper investigates rotation-equivariant U-Nets for image segmentation by benchmarking vanilla and $E(2)$-equivariant variants across five datasets that differ in orientation properties and scale. Using the $C_4$, $C_8$, and $D_4$ groups within a consistent U-Net framework, it assesses both segmentation performance and training-resource costs. Equivariance yields clear benefits on orientation-variant tasks (e.g., Kvasir-SEG, URDE) but can underperform on rotationally symmetric or general datasets (e.g., NucleiSeg, iSAID). The results point to a nuanced trade-off: equivariant models can match or exceed vanilla performance on some datasets, especially with larger models and more data, yet they often require more compute, and the gains may not justify that cost in every context. The study suggests future work on hybrid architectures that jointly learn equivariant and non-equivariant features, so that both kinds of information can be exploited across diverse segmentation tasks.
Abstract
Numerous studies have recently focused on incorporating different forms of equivariance into Convolutional Neural Networks (CNNs). In particular, rotation equivariance has attracted significant attention due to its relevance in many applications, including medical imaging, microscopic imaging, satellite imaging, and industrial tasks. While prior research has primarily focused on improving classification with rotation-equivariant CNNs, their impact on more complex architectures, such as U-Net for image segmentation, remains scarcely explored. Indeed, previous work on integrating rotation equivariance into the U-Net architecture has focused on specific applications with a limited scope. In contrast, this paper aims to provide a more exhaustive evaluation of rotation-equivariant U-Nets for image segmentation across a broader range of tasks. We benchmark their effectiveness against standard U-Net architectures, assessing improvements in terms of performance and sustainability (i.e., computational cost). Our evaluation covers datasets in which the objects of interest appear at arbitrary orientations (e.g., Kvasir-SEG), but also more standard segmentation datasets (such as COCO-Stuff), so as to explore the wider applicability of rotation equivariance beyond tasks for which it is clearly relevant. The main contribution of this work is to provide insights into the trade-offs and advantages of integrating rotation equivariance for segmentation tasks.
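
To make the notion of rotation equivariance concrete, the following is a minimal NumPy sketch, not the $E(2)$-equivariant layers benchmarked in the paper. It assumes the simplest possible construction: a single filter averaged over its four 90-degree rotations, which makes a zero-padded cross-correlation commute with $C_4$ rotations of the input. The helper names (`correlate2d_same`, `c4_symmetrize`, `equivariance_error`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def correlate2d_same(image, kernel):
    """Zero-padded 'same' cross-correlation with an odd-sized square kernel."""
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(image, pad)  # constant zero padding on every side
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out


def c4_symmetrize(kernel):
    """Average a kernel over the four rotations of C_4 (0, 90, 180, 270 degrees)."""
    return sum(np.rot90(kernel, r) for r in range(4)) / 4.0


def equivariance_error(kernel, image, r=1):
    """Largest absolute gap between 'rotate then filter' and 'filter then rotate'."""
    rotate_then_filter = correlate2d_same(np.rot90(image, r), kernel)
    filter_then_rotate = np.rot90(correlate2d_same(image, kernel), r)
    return float(np.max(np.abs(rotate_then_filter - filter_then_rotate)))


rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))

err_plain = equivariance_error(kernel, image)                 # generic filter
err_c4 = equivariance_error(c4_symmetrize(kernel), image)     # C_4-averaged filter
print(f"plain kernel: {err_plain:.3e}, C4-symmetrized kernel: {err_c4:.3e}")
```

The symmetrized filter should give an error at the level of floating-point round-off, while a generic filter does not. Averaging the filter is a deliberately restrictive choice: it forces every filter to be rotation-invariant, whereas equivariant architectures of the kind evaluated here are generally built to retain orientation information in their feature maps.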
