Learning with 3D rotations, a hitchhiker's guide to SO(3)
A. René Geist, Jonas Frey, Mikel Zhobro, Anna Levina, Georg Martius
TL;DR
This survey analyzes representations of rotations in $SO(3)$ for gradient-based neural regression. It argues that low-dimensional representations (≤4D) introduce discontinuities that harm learnability, especially when rotations are in the model's output. Comparing several representations, it shows that $\mathbb{R}^9$+SVD and $\mathbb{R}^6$+GSO (Gram-Schmidt orthonormalization) consistently yield better optimization properties and generalization than Euler-angle, axis-angle, or quaternion-only approaches, and that distance-picking and half-space tricks do not fully resolve the underlying discontinuities. Experiments on rotation estimation and feature-prediction tasks corroborate the theoretical guidance, favoring high-dimensional representations and, for small-angle dynamics, half-space-mapped quaternions as a practical compromise. The work gives concrete recommendations for choosing a rotation representation depending on whether rotations are inputs or outputs and on the expected rotation magnitude, with implications for pose estimation, 3D vision, and robotics, and it highlights trade-offs in computation and training stability.
Abstract
Many settings in machine learning require the selection of a rotation representation. However, choosing a suitable representation from the many available options is challenging. This paper acts as a survey and guide through rotation representations. We walk through their properties that harm or benefit deep learning with gradient-based optimization. By consolidating insights from rotation-based learning, we provide a comprehensive overview of learning functions with rotation representations. We provide guidance on selecting representations based on whether rotations are in the model's input or output and whether the data primarily comprises small angles.
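As a concrete illustration of the $\mathbb{R}^6$+GSO representation named in the TL;DR, the following is a minimal sketch of how a 6D vector (for example, a raw network output) could be mapped to a rotation matrix by Gram-Schmidt orthonormalization. The function name `rotation_from_6d`, the helper names, and the convention that the two 3D halves become the first two columns of the matrix are assumptions made for this example, not details taken from the paper.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    if n < 1e-12:
        # Degenerate input: zero vector, or second vector parallel to the first.
        raise ValueError("cannot normalize a near-zero vector")
    return [x / n for x in v]


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(x):
    """Map a 6D vector to a 3x3 rotation matrix (list of rows).

    Assumed convention: x[:3] and x[3:] are unnormalized candidates
    for the first two columns of the rotation matrix.
    """
    a1, a2 = list(x[:3]), list(x[3:])
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize.
    proj = _dot(b1, a2)
    b2 = _normalize([a2[i] - proj * b1[i] for i in range(3)])
    # The third column is fixed by right-handedness.
    b3 = _cross(b1, b2)
    # Assemble the columns b1, b2, b3 into a row-major matrix.
    return [[b1[i], b2[i], b3[i]] for i in range(3)]
```

Every non-degenerate input yields an orthonormal matrix with determinant +1. The sketch raises an error for degenerate inputs (a zero first vector, or a second vector parallel to the first), which a practical implementation would need to handle according to its own requirements.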
