Full-range Head Pose Geometric Data Augmentations
Huei-Chung Hu, Xuyang Wu, Haowei Liu, Ting-Ruen Wei, Hsin-Tai Wu
TL;DR
This work tackles the core issue in head pose estimation—ambiguous coordinate systems and Euler-angle handling that hinder full-range pose coverage. It develops a rigorous mathematical framework to identify and align coordinate systems, axis sequences, and rotation definitions, then derives 2D geometric augmentation formulas and correct drawing routines that operate on rotation matrices rather than solely on Euler angles. By anchoring 300W-LP within a coherent left-handed intrinsic XYZ framework and providing stable Euler-angle extraction and conversion methods, the authors enable reliable full-range data augmentation and synthetic pose generation (via Blender and Panohead). Empirically, these augmentations improve HPE performance and enable the creation of diverse, full-range pose datasets (e.g., CMU_HPE_10K) while maintaining compatibility with existing architectures like 6D-RepNet+. The practical impact is a more robust, scalable approach to full-range HPE data generation and model training.
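The stable Euler-angle extraction mentioned above can be illustrated with a minimal sketch. It assumes the intrinsic XYZ composition R = Rx(pitch) · Ry(yaw) · Rz(roll) built from the standard right-handed elementary rotation matrices; the paper's left-handed convention may differ from this in sign. The function names `euler_xyz_to_matrix` and `matrix_to_euler_xyz` and the threshold `eps` are hypothetical and are not taken from the authors' code.

```python
import math


def euler_xyz_to_matrix(pitch, yaw, roll):
    """Compose R = Rx(pitch) * Ry(yaw) * Rz(roll); angles in radians.

    Assumption: standard right-handed elementary rotations. The paper's
    left-handed convention may flip some signs.
    """
    ca, sa = math.cos(pitch), math.sin(pitch)
    cb, sb = math.cos(yaw), math.sin(yaw)
    cc, sc = math.cos(roll), math.sin(roll)
    return [
        [cb * cc, -cb * sc, sb],
        [ca * sc + sa * sb * cc, ca * cc - sa * sb * sc, -sa * cb],
        [sa * sc - ca * sb * cc, sa * cc + ca * sb * sc, ca * cb],
    ]


def matrix_to_euler_xyz(R, eps=1e-6):
    """Recover (pitch, yaw, roll) from R under the same assumed convention.

    The returned yaw always lies in [-pi/2, pi/2]. Near gimbal lock
    (|cos(yaw)| below eps), pitch and roll are not separately observable,
    so roll is fixed to 0 and the combined angle is assigned to pitch.
    """
    cos_yaw = math.hypot(R[0][0], R[0][1])  # equals |cos(yaw)|
    yaw = math.atan2(R[0][2], cos_yaw)
    if cos_yaw > eps:
        pitch = math.atan2(-R[1][2], R[2][2])
        roll = math.atan2(-R[0][1], R[0][0])
    else:
        pitch = math.atan2(R[2][1], R[1][1])
        roll = 0.0
    return pitch, yaw, roll


# Usage: a yaw of 2.5 rad (about 143 degrees) lies outside [-pi/2, pi/2].
# The extracted triplet differs from the input, yet it rebuilds the same matrix.
R = euler_xyz_to_matrix(0.2, 2.5, -0.4)
R_again = euler_xyz_to_matrix(*matrix_to_euler_xyz(R))
```

The usage lines show why operating on rotation matrices is safer than operating on Euler angles alone: for poses beyond ±90° of yaw, several angle triplets describe the same rotation, while the matrix itself is unambiguous.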
Abstract
Many head pose estimation (HPE) methods promise the ability to create full-range datasets, which would in theory allow the rotation and position of the head to be estimated from any angle. However, these methods are accurate only within a limited range of head angles; outside that range, they produce significant inaccuracies. The main cause is that the coordinate systems and Euler angles used in the underlying rotation matrix calculations are not clearly specified. Here, we address these limitations by presenting (1) methods that accurately infer the correct coordinate system and the Euler angles in the correct axis sequence, (2) novel formulae for 2D geometric augmentations of rotation matrices under the identified coordinate system, (3) derivations of the correct drawing routines for rotation matrices and poses, and (4) mathematical experimentation and verification that enable proper pitch-yaw coverage for full-range head pose dataset generation. Applying our augmentation techniques to existing head pose estimation methods significantly improved model performance. Code will be released upon paper acceptance.
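As an illustration of item (2), the following minimal sketch shows one 2D geometric augmentation, a horizontal image flip, applied directly to a rotation matrix. It assumes that the image x-axis coincides with the camera x-axis and that R = Rx(pitch) · Ry(yaw) · Rz(roll) with standard right-handed elementary rotations, so the signs may differ from the paper's left-handed formulae. The names `euler_xyz_to_matrix` and `hflip_rotation` are hypothetical and are not the authors' implementation.

```python
import math


def euler_xyz_to_matrix(pitch, yaw, roll):
    """Compose R = Rx(pitch) * Ry(yaw) * Rz(roll); angles in radians.

    Assumption: standard right-handed elementary rotations.
    """
    ca, sa = math.cos(pitch), math.sin(pitch)
    cb, sb = math.cos(yaw), math.sin(yaw)
    cc, sc = math.cos(roll), math.sin(roll)
    return [
        [cb * cc, -cb * sc, sb],
        [ca * sc + sa * sb * cc, ca * cc - sa * sb * sc, -sa * cb],
        [sa * sc - ca * sb * cc, sa * cc + ca * sb * sc, ca * cb],
    ]


def hflip_rotation(R):
    """Rotation matrix of the horizontally mirrored image: M * R * M.

    M = diag(-1, 1, 1) mirrors the x-axis. Conjugating by M negates every
    entry that has exactly one index on the x-axis, and the result is
    still a proper rotation (determinant +1).
    """
    s = (-1.0, 1.0, 1.0)
    return [[s[i] * s[j] * R[i][j] for j in range(3)] for i in range(3)]


# Usage: under the assumed convention, the flip keeps pitch and negates yaw
# and roll, including for a yaw beyond 90 degrees (2.5 rad here).
R = euler_xyz_to_matrix(0.2, 2.5, -0.4)
flipped = hflip_rotation(R)
expected = euler_xyz_to_matrix(0.2, -2.5, 0.4)
```

Because the flip acts on the matrix entries, it needs no Euler-angle extraction and therefore behaves the same at any pose, including those near gimbal lock.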
