HandSCS: Structural Coordinate Space for Animatable Hand Gaussian Splatting

Yilan Dong, Wenqing Wang, Qing Wang, Jiahao Yang, Haohe Liu, Xiatuan Zhu, Gregory Slabaugh, Shanxin Yuan

TL;DR

HandSCS tackles animatable hand avatars from multi-view images by introducing a structure-guided Gaussian Splatting framework. It leverages a Structural Coordinate Space (SCS) to provide intra-pose structural cues through a hybrid static-dynamic bone basis and angular-radial descriptors, and an Inter-Pose Consistency Loss to enforce cross-pose coherence. Per-Gaussian embeddings and pose-aware offsets enable accurate geometry and appearance under strong deformations, with a lightweight non-rigid deformation pipeline. On InterHand2.6M, HandSCS achieves state-of-the-art performance for novel-pose animation and novel-view synthesis while maintaining real-time rendering speed.

Abstract

Creating animatable hand avatars from multi-view images requires modeling complex articulations and maintaining structural consistency across poses in real time. We present HandSCS, a structure-guided 3D Gaussian Splatting framework for high-fidelity hand animation. Unlike existing approaches that condition all Gaussians on the same global pose parameters, which are inadequate for highly articulated hands, HandSCS equips each Gaussian with explicit structural guidance from both intra-pose and inter-pose perspectives. To establish intra-pose structural guidance, we introduce a Structural Coordinate Space (SCS), which bridges the gap between sparse bones and dense Gaussians through a hybrid static-dynamic coordinate basis and angular-radial descriptors. To improve cross-pose coherence, we further introduce an Inter-pose Consistency Loss that promotes consistent Gaussian attributes under similar articulations. Together, these components achieve high-fidelity results with consistent fine details, even in challenging high-deformation and self-contact regions. Experiments on the InterHand2.6M dataset demonstrate that HandSCS achieves state-of-the-art performance in hand avatar animation, confirming the effectiveness of explicit structural modeling.

Paper Structure

This paper contains 19 sections, 15 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: We introduce HandSCS, a structure-guided 3D Gaussian Splatting framework that provides each Gaussian with structural cues from both intra-pose and inter-pose perspectives. Intra-pose guidance is delivered through a Structural Coordinate Space that bridges sparse skeletal joints and dense Gaussians, while inter-pose guidance encourages attribute consistency across similar articulations. This formulation preserves clear boundaries and fine details, enabling high-fidelity animation even under strong deformations or self-contact.
  • Figure 2: Overview of HandSCS. Each 3D Gaussian is modeled in canonical space. At the intra-pose level, we extract the LBS transformation as the motion embedding $\mathcal{B}$ and the rigidly posed position $x'_{\textit{lbs}}$; the latter is combined with the skeleton joints to compute the bone-relative positional embedding $\mathcal{P}$. Together with the disentangled geometry ($e_g$) and appearance ($e_a$) embeddings, these features are used to predict adaptive attribute offsets. At the inter-pose level, a structure-consistency loss encourages consistent attributes across poses. At the point level, a neighborhood-aware densification strategy improves point distribution in under-reconstructed regions.
  • Figure 3: Illustration of the Structural Coordinate Space (SCS). The left part shows the extended kinematic topology $\mathcal{E}$ based on the MANO model, with cross-finger pseudo-bones added and redundant connections removed. The right part illustrates how each Gaussian is encoded by the cosine of its angle $\rho_{u,v}$ and its distance $d_{u,v}$ relative to each bone in the posed space.
  • Figure 4: Qualitative comparison of novel pose synthesis. We compare results on InterHand2.6M across three subjects using HandAvatar, LiveHand, GauHuman, and our method. HandSCS produces sharper structures, cleaner boundaries, and fewer artifacts under challenging articulations.
  • Figure 5: Qualitative comparison of novel view synthesis.
  • ...and 2 more figures
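
The Figure 3 caption describes encoding each Gaussian by an angle cosine $\rho_{u,v}$ and a distance $d_{u,v}$ relative to each bone $(u, v)$ in posed space. The following is a minimal NumPy sketch of one way such angular-radial descriptors could be computed, not the authors' implementation. It assumes the angle is measured at joint $u$ between the bone direction and the vector to the point, and that the distance is the point-to-segment distance; the function name, argument layout, and `eps` stabilizer are hypothetical choices made for illustration.

```python
import numpy as np

def angular_radial_descriptors(points, joints, bones, eps=1e-8):
    """Hypothetical sketch of per-point, per-bone descriptors.

    points: (N, 3) posed Gaussian centers
    joints: (J, 3) posed joint positions
    bones:  sequence of (u, v) joint-index pairs
    Returns rho (N, B) cosines and d (N, B) distances.
    """
    bones = np.asarray(bones)
    start = joints[bones[:, 0]]                    # (B, 3) joint u
    axis = joints[bones[:, 1]] - start             # (B, 3) bone vector u -> v
    rel = points[:, None, :] - start[None, :, :]   # (N, B, 3) point relative to u

    axis_len = np.linalg.norm(axis, axis=-1)       # (B,)
    rel_len = np.linalg.norm(rel, axis=-1)         # (N, B)
    dot = np.einsum("nbk,bk->nb", rel, axis)       # (N, B)

    # Assumed angular term: cosine between the bone and the vector to the point.
    rho = dot / (rel_len * axis_len[None, :] + eps)

    # Assumed radial term: distance to the closest point on the bone segment.
    t = np.clip(dot / (axis_len[None, :] ** 2 + eps), 0.0, 1.0)
    closest = start[None, :, :] + t[..., None] * axis[None, :, :]
    d = np.linalg.norm(points[:, None, :] - closest, axis=-1)
    return rho, d

# Toy usage: one bone along the x-axis and two query points.
joints = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
points = np.array([[0.5, 1.0, 0.0], [2.0, 0.0, 0.0]])
rho, d = angular_radial_descriptors(points, joints, [(0, 1)])
```

Both outputs are invariant to a rigid transform applied jointly to the points and the skeleton, which is one plausible reason to prefer bone-relative descriptors over raw coordinates when conditioning each Gaussian.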