JGHand: Joint-Driven Animatable Hand Avatar via 3D Gaussian Splatting
Zhoutao Sun, Xukun Shen, Yong Hu, Yuyou Zhong, Xueyang Zhou
TL;DR
This work introduces JGHand, a joint-driven, animatable hand avatar built on 3D Gaussian Splatting (3DGS) that renders photorealistic images in real time across arbitrary poses. A differentiable, zero-error skeleton transformation maps the canonical hand Gaussians to any target pose and bone length, so that Linear Blend Skinning can apply accurate, pose-aware deformations. A depth-based shadow layer simulates the self-occlusion shadows cast by the fingers, also in real time. Identity priors, implemented as a trainable triplane feature together with pose-aware offsets, personalize hand appearance without relying on explicit morphable-model parameters. Ablations and cross-dataset experiments show better rendering quality and speed than state-of-the-art methods, which makes the representation a candidate for integration into pose estimation and interactive applications. One limitation is that the training data must cover the hand texture completely.
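As a rough illustration of the Linear Blend Skinning step mentioned above, the following sketch blends per-bone rigid transforms to move a few canonical Gaussian centers to a posed configuration. It is a minimal, generic LBS example and not the authors' implementation: the helper names (`rot_z`, `apply_transform`, `lbs`), the two-bone setup, and the hand-picked skinning weights are all assumptions made for illustration. It moves only the Gaussian centers; a full 3DGS pipeline would also need to transform each Gaussian's rotation or covariance.

```python
import math

def rot_z(theta, pivot):
    """Hypothetical helper: 3x4 rigid transform rotating by theta about the z-axis through `pivot`."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    # t = pivot - R @ pivot keeps the pivot (the joint) fixed under the transform.
    t = [pivot[i] - sum(R[i][j] * pivot[j] for j in range(3)) for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)]

def apply_transform(T, p):
    """Apply a 3x4 rigid transform [R | t] to a 3D point."""
    return [sum(T[i][j] * p[j] for j in range(3)) + T[i][3] for i in range(3)]

def lbs(points, weights, transforms):
    """Linear Blend Skinning: p' = sum_k w_k * (R_k p + t_k)."""
    posed = []
    for p, w in zip(points, weights):
        out = [0.0, 0.0, 0.0]
        for w_k, T_k in zip(w, transforms):
            q = apply_transform(T_k, p)
            out = [o + w_k * q_i for o, q_i in zip(out, q)]
        posed.append(out)
    return posed

IDENTITY = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

# Assumed toy skeleton: bone 0 stays fixed, bone 1 bends 90 degrees about a joint at x = 1.
transforms = [IDENTITY, rot_z(math.pi / 2, [1.0, 0.0, 0.0])]

# Canonical Gaussian centers and their (hand-picked) skinning weights over the two bones.
centers = [[0.5, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

posed = lbs(centers, weights, transforms)
```

In this toy setup the center bound to the fixed bone stays in place, the center bound to the bent bone swings around the joint, and the center sitting exactly on the joint is unchanged whatever its weights are, because both transforms leave the joint fixed.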
Abstract
Since hands are the primary interface in daily interactions, modeling high-quality digital human hands and rendering realistic images of them is a critical research problem. Interactive and rendering applications additionally require that the digital model be drivable and render in real time without compromising quality. We therefore propose Jointly 3D Gaussian Hand (JGHand), a novel joint-driven, 3D Gaussian Splatting (3DGS)-based hand representation that renders high-fidelity hand images in real time for various poses and characters. Unlike existing articulated neural rendering techniques, we introduce a differentiable process for spatial transformations based on 3D keypoints. This process supports deformations from the canonical template to a mesh with arbitrary bone lengths and poses. Additionally, we propose a real-time shadow simulation method based on per-pixel depth to simulate the self-occlusion shadows caused by finger movements. Finally, we embed the hand prior and propose an animatable 3DGS representation of the hand driven solely by 3D keypoints. We validate the effectiveness of each component of our approach through comprehensive ablation studies. Experimental results on public datasets demonstrate that JGHand renders in real time with higher quality than state-of-the-art methods.
