GaussianBody: Clothed Human Reconstruction via 3d Gaussian Splatting

Mengtian Li, Shengxiang Yao, Zhifeng Xie, Keyu Chen

TL;DR

GaussianBody adapts 3D Gaussian Splatting to dynamic clothed humans by introducing pose-guided deformation and a physically-based prior to stabilize optimization. A canonical-to-observation-space framework, split-with-scale density enhancement, and pose refinement enable fast training (~1h on a single RTX 4090) and high-fidelity novel-view rendering with explicit geometry. Experiments on PeopleSnapshot and iPER show state-of-the-art performance against baselines, with robust geometry recovery and detailed cloth textures, validated by ablations. Limitations include deformation MLP pitfalls and challenges in novel pose synthesis, suggesting future work in balanced non-rigid cloth modeling and improved pose handling.

Abstract

In this work, we propose a novel clothed human reconstruction method called GaussianBody, based on 3D Gaussian Splatting. Compared with the costly neural radiance based models, 3D Gaussian Splatting has recently demonstrated great performance in terms of training time and rendering quality. However, applying the static 3D Gaussian Splatting model to the dynamic human reconstruction problem is non-trivial due to complicated non-rigid deformations and rich cloth details. To address these challenges, our method uses explicit pose-guided deformation to associate dynamic Gaussians across the canonical space and the observation space, and introduces a physically-based prior with regularized transformations that helps mitigate ambiguity between the two spaces. During training, we further propose a pose refinement strategy that updates the pose regression to compensate for inaccurate initial estimates, and a split-with-scale mechanism that enhances the density of the regressed point clouds. The experiments validate that our method achieves state-of-the-art photorealistic novel-view rendering results with high-quality details for dynamic clothed human bodies, along with explicit geometry reconstruction.
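
The pose-guided deformation between the canonical and observation spaces can be illustrated with linear blend skinning (LBS) applied to a single Gaussian center. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 3x4 `[R | t]` bone-transform layout, and the toy two-bone setup are all hypothetical.

```python
# Minimal LBS sketch for one Gaussian center (pure Python, no dependencies).
# All names and the 3x4 [R | t] layout are illustrative assumptions, not
# taken from the GaussianBody code.

def blend_transforms(weights, transforms):
    """Return the skinning-weighted sum of 3x4 bone transforms [R | t]."""
    blended = [[0.0] * 4 for _ in range(3)]
    for w, T in zip(weights, transforms):
        for r in range(3):
            for c in range(4):
                blended[r][c] += w * T[r][c]
    return blended

def apply_transform(T, point):
    """Apply a 3x4 transform [R | t] to a 3D point."""
    return [
        T[r][0] * point[0] + T[r][1] * point[1] + T[r][2] * point[2] + T[r][3]
        for r in range(3)
    ]

def deform_center(center_canonical, skin_weights, bone_transforms):
    """Map a canonical-space center into observation space via LBS."""
    T = blend_transforms(skin_weights, bone_transforms)
    return apply_transform(T, center_canonical)

# Two toy bones: identity, and a 90-degree rotation about z plus a shift in x.
IDENTITY = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

# A center fully bound to the second bone.
center_observed = deform_center([1.0, 0.0, 0.0], [0.0, 1.0], [IDENTITY, ROT_Z_90])
```

In a full pipeline the rotational part of the blended transform would presumably also be applied to each Gaussian's orientation, and the skinning weights would come from the body model rather than being set by hand.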

Paper Structure (21 sections, 12 equations, 9 figures, 1 table)

This paper contains 21 sections, 12 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: GaussianBody takes monocular RGB video as input, reconstructing a clothed human model from 1080$\times$1080 images in around 1 hour on a single 4090 GPU. The resulting human model serves as a tool for simulating human performance in novel views. Furthermore, we offer the point cloud as a mechanism for deformation control.
  • Figure 2: Overview of our pipeline. We initialize the point cloud using SMPL vertices, deforming the position and rotation parameters of Gaussians through SMPL forward linear blend skinning (LBS) to transform them into the observation space. The canonical model is then optimized, taking into account the physically-based prior $\mathcal{L}_{rigid}, \mathcal{L}_{rot}, \mathcal{L}_{iso}$. To address image blurriness, we optimize the pose parameters. The output includes both the point cloud and the appearance of the reconstructed human.
  • Figure 3: Local-rigidity loss. As Gaussian $i$ rotates between the two spaces, its neighbouring Gaussians $j$ should follow the same rigid transform, expressed in the coordinate system of Gaussian $i$.
  • Figure 4: Results of novel view synthesis and point clouds on the PeopleSnapshot dataset [alldieck2018video]. Our method effectively restores details on the human body, including intricate details in the hair and folds in the clothes. Moreover, the generated point cloud faithfully captures geometric details of the clothing, demonstrating a commendable separation between geometry and texture.
  • Figure 5: Visual comparison of novel view synthesis by different methods on PeopleSnapshot [alldieck2018video] (columns 1 and 2) and iPER [liu2019liquid] (columns 3 and 4). 3D-GS [kerbl20233d] relies on multi-view consistency to reconstruct the central subject and therefore fails on dynamic scenes. InstantAvatar [jiang2023instantavatar] trades quality and robustness for speed and may produce blurry results when the pose parameters are inaccurate. Our method produces high-fidelity results, especially in cloth texture, and is more robust.
  • ...and 4 more figures
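
The local-rigidity idea in the Figure 3 caption can be sketched as a residual: if Gaussian $i$ rotates by some rotation between the two spaces, the offset to a neighbour $j$ should rotate with it. The form below is an assumed illustration, not the paper's exact $\mathcal{L}_{rigid}$; the function names, the use of a single 3x3 rotation `R_i`, and the absence of neighbour weighting are all hypothetical.

```python
import math

# Assumed form of a local-rigidity residual for one (i, j) neighbour pair.
# Names and formulation are illustrative, not the paper's exact loss.

def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]

def local_rigidity_residual(mu_i_can, mu_j_can, mu_i_obs, mu_j_obs, R_i):
    """Distance between neighbour j's observed offset from i and the offset
    predicted by rotating the canonical offset with R_i (the assumed rotation
    of Gaussian i from canonical to observation space)."""
    offset_can = [mu_j_can[k] - mu_i_can[k] for k in range(3)]
    offset_obs = [mu_j_obs[k] - mu_i_obs[k] for k in range(3)]
    predicted = mat_vec(R_i, offset_can)
    return math.sqrt(sum((offset_obs[k] - predicted[k]) ** 2 for k in range(3)))

# Gaussian i rotates 90 degrees about z and moves to (2, 0, 0).
R_i = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
mu_i_can, mu_j_can = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
mu_i_obs = [2.0, 0.0, 0.0]

# Neighbour that follows the rigid motion: zero residual.
rigid = local_rigidity_residual(mu_i_can, mu_j_can, mu_i_obs, [2.0, 1.0, 0.0], R_i)
# Neighbour that only translates without rotating: non-zero residual.
non_rigid = local_rigidity_residual(mu_i_can, mu_j_can, mu_i_obs, [3.0, 0.0, 0.0], R_i)
```

A training loss would presumably sum such residuals over each Gaussian's nearest neighbours, possibly with distance-based weights; those details are not stated in the captions above.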