Human as Points: Explicit Point-based 3D Human Reconstruction from Single-view RGB Images
Yingzhi Tang, Qijian Zhang, Junhui Hou, Yebin Liu
TL;DR
HaP tackles single-view 3D human reconstruction by replacing implicit representations with an explicit, point-based pipeline. It jointly leverages depth maps and rectified SMPL priors, using a conditional diffusion model to generate a complete human point cloud in 3D space, followed by refinement and mesh extraction. The approach introduces an SMPL rectification module and a diffusion-based 3D generator conditioned on depth and SMPL cues, plus a new CityUHuman dataset with detailed scans. Empirical results show quantitative improvements of 20–40% over state-of-the-art implicit methods and competitive performance against advanced explicit/hybrid techniques, underscoring the practical value of explicit, geometry-centric design for robust, richly detailed 3D human reconstruction.
Abstract
Recent research on single-view human reconstruction is devoted to learning deep implicit functions constrained by explicit body shape priors. Despite remarkable performance improvements over traditional processing pipelines, existing learning approaches still show limitations in flexibility, generalizability, robustness, and/or representation capability. To comprehensively address these issues, in this paper we investigate an explicit point-based human reconstruction framework called HaP, which adopts point clouds as the intermediate representation of the target geometric structure. Technically, our approach features fully explicit point cloud estimation, manipulation, generation, and refinement in the 3D geometric space, instead of an implicit learning process that can be ambiguous and less controllable. The overall workflow is carefully organized, with dedicated designs for the corresponding specialized learning components and processing procedures. Extensive experiments demonstrate that our framework achieves quantitative performance improvements of 20% to 40% over current state-of-the-art methods, as well as better qualitative results. Our promising results may indicate a paradigm rollback to fully explicit, geometry-centric algorithm design, which makes it possible to exploit various powerful point cloud modeling architectures and processing techniques. We will make our code and data publicly available at https://github.com/yztang4/HaP.
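To make the notion of explicit point cloud estimation from a depth map concrete, the following is a minimal sketch of standard pinhole back-projection, which lifts each valid depth pixel to a 3D point. It is an illustration under stated assumptions, not the HaP implementation: the function name `backproject_depth`, the pinhole camera model, and the intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical choices made here, and the abstract does not specify how HaP converts depth to points.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Lift a depth map to a partial 3D point cloud (hypothetical helper).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), both in pixels. Pixels with non-positive depth are
    treated as background and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):        # v: pixel row index
        for u, z in enumerate(row):        # u: pixel column index
            if z <= 0:
                continue                   # no valid depth at this pixel
            x = (u - cx) * z / fx          # camera-space X
            y = (v - cy) * z / fy          # camera-space Y
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with two valid pixels (illustrative values only).
toy_depth = [[0.0, 2.0],
             [2.0, 0.0]]
partial_cloud = backproject_depth(toy_depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

A cloud obtained this way covers only the surface visible to the camera, which is why a generative stage is needed to produce a complete human point cloud.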
