Multi-Rigid-Body Approximation of Human Hands with Application to Digital Twin
Bin Zhao, Yiwen Lu, Haohua Zhu, Xiao Li, Sheng Yi
TL;DR
The paper tackles the challenge of real-time, visually faithful hand simulation for digital twins by converting a personalized MANO hand model into a multi-rigid-body URDF representation. It introduces a mathematically grounded projection framework that maps unconstrained SO(3) joint rotations to kinematically constrained joints, using closed-form solutions for single-DOF joints and BCH-corrected iterations for two-DOF joints. The authors present a full pipeline from motion capture to rigid-body hand simulation, including automated mesh segmentation and anatomically informed axis determination. Experiments show sub-centimeter tracking error and 1000+ Hz simulation, enabling accurate replay of human demonstrations with RL policies across diverse manipulation tasks.
Abstract
Human hand simulation plays a critical role in digital twin applications, requiring models that balance anatomical fidelity with computational efficiency. We present a complete pipeline for constructing multi-rigid-body approximations of human hands that preserve realistic appearance while enabling real-time physics simulation. Starting from optical motion capture of a specific human hand, we construct a personalized MANO (hand Model with Articulated and Non-rigid defOrmations) model and convert it to a URDF (Unified Robot Description Format) representation with anatomically consistent joint axes. The key technical challenge is projecting MANO's unconstrained SO(3) joint rotations onto the kinematically constrained joints of the rigid-body model. We derive closed-form solutions for single degree-of-freedom joints and introduce a Baker-Campbell-Hausdorff (BCH)-corrected iterative method for two degree-of-freedom joints that properly handles the non-commutativity of rotations. We validate our approach through digital twin experiments where reinforcement learning policies control the multi-rigid-body hand to replay captured human demonstrations. Quantitative evaluation shows sub-centimeter reconstruction error and successful grasp execution across diverse manipulation tasks.
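
To make the single degree-of-freedom projection concrete, the sketch below shows one standard closed form: given an unconstrained rotation matrix R and a unit hinge axis a, the angle θ that minimizes the Frobenius distance between R and a pure rotation about a is θ = atan2(a · vee(R − Rᵀ), tr(R) − aᵀRa). The abstract does not state which closed form the authors use, so this is an illustrative assumption rather than their derivation, and the function names `rodrigues` and `project_to_hinge` are hypothetical.

```python
import math


def rodrigues(axis, theta):
    """Rotation matrix for a rotation of `theta` radians about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(theta), math.sin(theta)
    C = 1.0 - c
    return [
        [c + x * x * C, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, c + z * z * C],
    ]


def project_to_hinge(R, axis):
    """Hinge angle about unit `axis` closest to R in the Frobenius norm.

    Assumption: this chordal-distance closed form stands in for the
    single-DOF solution mentioned in the abstract; the paper's own
    formula may differ.
    """
    a = axis
    # a . vee(R - R^T): component of the skew-symmetric part of R along the axis
    s = (a[0] * (R[2][1] - R[1][2])
         + a[1] * (R[0][2] - R[2][0])
         + a[2] * (R[1][0] - R[0][1]))
    trace = R[0][0] + R[1][1] + R[2][2]
    # a^T R a: how much of R leaves the axis direction unchanged
    aRa = sum(a[i] * R[i][j] * a[j] for i in range(3) for j in range(3))
    # Maximizing s*sin(theta) + (trace - aRa)*cos(theta) gives the atan2 below
    return math.atan2(s, trace - aRa)


if __name__ == "__main__":
    z_axis = (0.0, 0.0, 1.0)
    R = rodrigues(z_axis, 0.7)
    print(project_to_hinge(R, z_axis))
```

A rotation that already lies on the hinge is recovered exactly, and a rotation about a perpendicular axis projects to zero. When both arguments of `atan2` vanish (a 180° rotation about an axis perpendicular to the hinge), every angle is equally good and the function returns 0 by Python's convention.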
