HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting
Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu
TL;DR
HumanGaussian tackles text-driven 3D human generation by combining 3D Gaussian Splatting with structure-aware guidance. The approach initializes Gaussians on an SMPL-X surface and uses a dual-branch diffusion model to jointly learn texture and structure, together with an annealed negative prompt strategy that avoids over-saturation. A prune-only phase then removes floating artifacts, improving geometry and appearance while keeping training efficient. In qualitative and user-study evaluations, the method shows competitive visual fidelity and better efficiency than existing text-to-3D human baselines.
Abstract
Realistic 3D human generation from text prompts is a desirable yet challenging task. Existing methods optimize 3D representations such as meshes or neural fields via score distillation sampling (SDS), and they suffer from inadequate fine details or excessive training time. In this paper, we propose an efficient yet effective framework, HumanGaussian, that generates high-quality 3D humans with fine-grained geometry and realistic appearance. Our key insight is that 3D Gaussian Splatting is an efficient renderer with periodic Gaussian shrinkage or growth, where such adaptive density control can be naturally guided by intrinsic human structures. Specifically, 1) we first propose a Structure-Aware SDS that simultaneously optimizes human appearance and geometry. The multi-modal score function from both RGB and depth space is leveraged to guide the Gaussian densification and pruning process. 2) Moreover, we devise an Annealed Negative Prompt Guidance by decomposing SDS into a noisier generative score and a cleaner classifier score, which effectively addresses the over-saturation issue. The floating artifacts are further eliminated based on Gaussian size in a prune-only phase to enhance generation smoothness. Extensive experiments demonstrate the superior efficiency and competitive quality of our framework, rendering vivid 3D humans under diverse scenarios. Project Page: https://alvinliu0.github.io/projects/HumanGaussian
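
The decomposition mentioned in the abstract can be illustrated with a minimal sketch. It relies only on the algebraic identity that the SDS residual (conditional noise prediction minus sampled noise) splits into a term that still contains the sampled noise (the "noisier" generative score) and a term in which it cancels (the "cleaner" classifier score). Everything else here is an assumption made for illustration and is not taken from the paper: the function names, the NumPy arrays standing in for diffusion-model outputs, the form of the negative-prompt term, and the switching schedule (`t_switch`, `w_neg`).

```python
import numpy as np


def decompose_sds_score(eps_cond, eps_uncond, noise):
    """Split the SDS residual (eps_cond - noise) into two parts.

    eps_cond   : noise predicted with the text prompt
    eps_uncond : noise predicted with an empty prompt
    noise      : the Gaussian noise actually added to the rendering
    """
    generative = eps_uncond - noise     # still contains the sampled noise
    classifier = eps_cond - eps_uncond  # sampled noise cancels out
    return generative, classifier


def annealed_guidance(eps_cond, eps_uncond, eps_neg, noise, t,
                      t_switch=200, w_neg=1.0):
    """Hypothetical annealing rule (an assumption, not the paper's formula).

    For large timesteps the full residual is used. For small timesteps the
    noisy generative term is replaced by a negative-prompt term that pushes
    the result away from eps_neg (the prediction under a negative prompt).
    """
    generative, classifier = decompose_sds_score(eps_cond, eps_uncond, noise)
    if t > t_switch:
        return generative + classifier
    negative = eps_uncond - eps_neg
    return classifier + w_neg * negative


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 8, 8)  # stand-in for a latent-space tensor
    eps_cond, eps_uncond, eps_neg, noise = (rng.normal(size=shape) for _ in range(4))
    g, c = decompose_sds_score(eps_cond, eps_uncond, noise)
    # The two parts always add back up to the original SDS residual.
    print(np.allclose(g + c, eps_cond - noise))
```

In a real pipeline the three predictions would come from a diffusion model evaluated on a noised rendering, and the resulting score would be back-propagated to the Gaussian parameters; that machinery is omitted here.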
