Weakly-Supervised 3D Hand Reconstruction with Knowledge Prior and Uncertainty Guidance
Yufei Zhang, Jeffrey O. Kephart, Qiang Ji
TL;DR
The paper addresses the ill-posed problem of monocular 3D hand reconstruction by proposing a weakly-supervised framework that embeds comprehensive hand knowledge from biomechanics, functional anatomy, and physics as differentiable priors, enabling training with only 2D landmark annotations. It additionally models image observation uncertainty with a heteroscedastic Negative Log-Likelihood loss, improving robustness to occlusion and depth ambiguity. The approach yields substantial improvements over prior weakly-supervised methods, achieving roughly a 21% gain on the FreiHAND dataset, and demonstrates strong data efficiency and generalization across multiple datasets. By reducing dependence on expensive 3D supervision while providing principled uncertainty estimates, the work makes deployment in VR/AR and HCI applications more practical.
Abstract
Fully-supervised monocular 3D hand reconstruction is often difficult because capturing the requisite 3D data entails deploying specialized equipment in a controlled environment. We introduce a weakly-supervised method that avoids such requirements by leveraging fundamental principles well-established in the understanding of the human hand's unique structure and functionality. Specifically, we systematically study hand knowledge from different sources, including biomechanics, functional anatomy, and physics. We effectively incorporate these valuable foundational insights into 3D hand reconstruction models through an appropriate set of differentiable training losses. This enables training solely with readily-obtainable 2D hand landmark annotations and eliminates the need for expensive 3D supervision. Moreover, we explicitly model the uncertainty that is inherent in image observations. We enhance the training process by exploiting a simple yet effective Negative Log Likelihood (NLL) loss that incorporates uncertainty into the loss function. Through extensive experiments, we demonstrate that our method significantly outperforms state-of-the-art weakly-supervised methods. For example, our method achieves nearly a 21% performance improvement on the widely adopted FreiHAND dataset.
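To illustrate how an NLL loss can fold uncertainty into 2D landmark supervision, the following is a minimal sketch and not the paper's exact formulation, which the abstract does not spell out. It assumes an isotropic Gaussian per landmark with a predicted log-variance; the function name `gaussian_nll_2d` and its arguments are hypothetical.

```python
import math

def gaussian_nll_2d(pred, target, log_var):
    """Mean Gaussian negative log-likelihood over 2D landmarks.

    Assumption (not from the paper): each landmark has an isotropic 2D
    Gaussian error model with variance sigma^2 = exp(log_var[i]).

    pred, target: sequences of (x, y) pairs, one per landmark.
    log_var:      sequence of predicted log-variances, one per landmark.
    """
    total = 0.0
    for (px, py), (tx, ty), s in zip(pred, target, log_var):
        sq_err = (px - tx) ** 2 + (py - ty) ** 2
        # -log p = ||e||^2 / (2 sigma^2) + log sigma^2 + log(2 pi)
        total += 0.5 * math.exp(-s) * sq_err + s + math.log(2.0 * math.pi)
    return total / len(pred)

# Hypothetical example: the second landmark is badly localized (e.g. occluded).
pred = [(10.0, 10.0), (30.0, 5.0)]
target = [(10.5, 9.5), (20.0, 5.0)]
confident = gaussian_nll_2d(pred, target, [0.0, 0.0])
hedged = gaussian_nll_2d(pred, target, [0.0, math.log(50.0)])
```

Under this assumed form, a larger predicted variance down-weights the squared error of an unreliable landmark while the `log_var` term penalizes claiming high uncertainty everywhere, so `hedged` comes out lower than `confident` in the example.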
