On the Continuity of Rotation Representations in Neural Networks
Yi Zhou, Connelly Barnes, Jingwan Lu, Jimei Yang, Hao Li
TL;DR
The paper addresses the problem that common 3D rotation representations (e.g., Euler angles, quaternions) are discontinuous, which makes them difficult for neural networks to regress. It gives a formal definition of a continuous rotation representation for neural networks and shows that no continuous representation of $SO(3)$ exists in real Euclidean spaces of four or fewer dimensions, while constructing continuous representations in 5 and 6 dimensions and a general construction for $SO(n)$. The main contributions are (i) a precise definition of continuity for representations, related to homeomorphism and embedding, (ii) two continuous representations of $SO(n)$ with $n^2-n$ and $n^2-2n+2$ dimensions (6D and 5D for $SO(3)$), (iii) extensions to related groups such as the orthogonal group $O(n)$ and similarity transforms $Sim(n)$, and (iv) empirical validation showing that the 5D and 6D representations outperform discontinuous ones on an autoencoder sanity test, rotation estimation for 3D point clouds, and inverse kinematics for 3D human poses. The results indicate that continuous representations are more suitable for learning, with practical implications for graphics and vision systems that regress rotations.
Abstract
In neural networks, it is often desirable to work with various representations of the same space. For example, 3D rotations can be represented with quaternions or Euler angles. In this paper, we advance a definition of a continuous representation, which can be helpful for training deep neural networks. We relate this to topological concepts such as homeomorphism and embedding. We then investigate which representations are continuous and which are discontinuous for 2D, 3D, and n-dimensional rotations. We demonstrate that for 3D rotations, all representations are discontinuous in the real Euclidean spaces of four or fewer dimensions. Thus, widely used representations such as quaternions and Euler angles are discontinuous and difficult for neural networks to learn. We show that 3D rotations have continuous representations in 5D and 6D, which are more suitable for learning. We also present continuous representations for the general case of the n-dimensional rotation group SO(n). While our main focus is on rotations, we also show that our constructions apply to other groups such as the orthogonal group and similarity transforms. We finally present empirical results, which show that our continuous rotation representations outperform discontinuous ones for several practical problems in graphics and vision, including a simple autoencoder sanity test, a rotation estimator for 3D point clouds, and an inverse kinematics solver for 3D human poses.
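To make the 6D case concrete, the following is a minimal NumPy sketch of one way such a representation can work: keep the first two columns of the rotation matrix (6 numbers), and recover a full rotation from any 6 numbers by Gram-Schmidt orthogonalization followed by a cross product. The abstract does not spell out the construction, so the function names `rotation_to_6d` and `rotation_from_6d`, the column ordering, and the `eps` guard are illustrative assumptions rather than the authors' reference implementation.

```python
import numpy as np


def rotation_to_6d(R):
    """Flatten the first two columns of a 3x3 rotation matrix into 6 numbers.

    Assumed layout (hypothetical): [column 0, column 1].
    """
    R = np.asarray(R, dtype=float)
    return np.concatenate([R[:, 0], R[:, 1]])


def rotation_from_6d(x, eps=1e-12):
    """Map 6 numbers back to a 3x3 rotation matrix.

    Gram-Schmidt on the two 3-vectors gives two orthonormal columns;
    the third column is their cross product, so the determinant is +1.
    `eps` is an assumed guard against division by zero for degenerate input.
    """
    x = np.asarray(x, dtype=float)
    a1, a2 = x[:3], x[3:]
    b1 = a1 / max(np.linalg.norm(a1), eps)
    u2 = a2 - np.dot(b1, a2) * b1          # remove the component along b1
    b2 = u2 / max(np.linalg.norm(u2), eps)
    b3 = np.cross(b1, b2)                  # completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)
```

In a regression setting, a network would output the 6 numbers directly and `rotation_from_6d` would be applied as the final layer; the input need not be orthonormal, only non-degenerate (the two 3-vectors must be nonzero and not parallel).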
