D$^3$-Human: Dynamic Disentangled Digital Human from Monocular Video
Honghu Chen, Bo Peng, Yunfan Tao, Juyong Zhang
TL;DR
D$^3$-Human tackles the challenge of reconstructing decoupled clothing and the underlying body from monocular video by combining explicit and implicit representations. It introduces the human manifold signed distance field (hmSDF) to segment visible clothing and visible body on the clothed surface, relies on SMPL-based completion for invisible regions, and uses dual non-rigid deformation fields to model clothing and body separately. A region-aggregation step fixes segmentation holes caused by parsing noise, and occlusion-aware differentiable rendering ensures consistent 2D supervision for both layers. The method achieves fast template generation, produces high-fidelity decoupled geometry, and supports applications such as clothing transfer and physics-based animation, advancing editable digital avatars from a single camera.
Abstract
We introduce D$^3$-Human, a method for reconstructing Dynamic Disentangled Digital Human geometry from monocular videos. Prior work on human reconstruction from monocular video either reconstructs the clothed human as a single, non-decoupled surface or reconstructs only the clothing, which makes the results difficult to use directly in applications such as animation production. The challenge in reconstructing decoupled clothing and body lies in the occlusion of the body by the clothing. The reconstruction must therefore preserve detail in the visible area while keeping the invisible area plausible. Our proposed method combines explicit and implicit representations to model the decoupled clothed human body, leveraging the robustness of explicit representations and the flexibility of implicit representations. Specifically, we reconstruct the visible region as a signed distance field (SDF), propose a novel human manifold signed distance field (hmSDF) to segment the visible clothing from the visible body, and then merge the visible and invisible body. Extensive experimental results demonstrate that, compared with existing reconstruction schemes, D$^3$-Human achieves high-quality decoupled reconstruction of human bodies wearing different clothing, and that its results can be directly applied to clothing transfer and animation.
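To make the segmentation idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that an hmSDF-like field has already been sampled as one signed scalar per mesh vertex, and that negative values mark clothing and non-negative values mark body; this sign convention is an assumption. It also assumes that the hole-fixing step can be approximated by relabeling small connected components. The function names `segment_by_sign`, `build_adjacency`, and `aggregate_regions` and the parameter `min_size` are hypothetical.

```python
from collections import deque

import numpy as np


def segment_by_sign(values):
    """Label vertices from a per-vertex signed scalar.

    Assumed convention (not from the paper): value < 0 -> clothing (1),
    value >= 0 -> body (0).
    """
    return (np.asarray(values) < 0).astype(int)


def build_adjacency(num_vertices, faces):
    """Build vertex adjacency sets from triangle faces."""
    adj = [set() for _ in range(num_vertices)]
    for a, b, c in faces:
        adj[a].update((b, c))
        adj[b].update((a, c))
        adj[c].update((a, b))
    return adj


def aggregate_regions(labels, adj, min_size):
    """Flip every connected same-label region smaller than min_size.

    Components are found on the input labeling; flips are written to a copy,
    so the result does not depend on vertex visiting order.
    """
    labels = np.asarray(labels)
    out = labels.copy()
    seen = np.zeros(len(labels), dtype=bool)
    for start in range(len(labels)):
        if seen[start]:
            continue
        # Breadth-first search over neighbors that share the start label.
        component = [start]
        seen[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for n in adj[v]:
                if not seen[n] and labels[n] == labels[start]:
                    seen[n] = True
                    component.append(n)
                    queue.append(n)
        if len(component) < min_size:
            out[component] = 1 - labels[start]
    return out


# Toy triangle strip with a single mislabeled vertex (index 3).
faces = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]
values = [-1.0, -1.0, -1.0, 0.5, -1.0, -1.0, -1.0]
labels = segment_by_sign(values)
cleaned = aggregate_regions(labels, build_adjacency(7, faces), min_size=2)
```

In this toy example, the isolated body-labeled vertex is surrounded by clothing and is absorbed into the clothing region, which mirrors how a small hole caused by noisy 2D parsing might be filled. A real pipeline would operate on the reconstructed surface and on a learned field rather than on hand-set values.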
