Divide and Fuse: Body Part Mesh Recovery from Partially Visible Human Images
Tianyu Luan, Zhongpai Gao, Luyuan Xie, Abhishek Sharma, Hao Ding, Benjamin Planche, Meng Zheng, Ange Lou, Terrence Chen, Junsong Yuan, Ziyan Wu
TL;DR
This work tackles reconstructing 3D human body meshes from monocular images with substantial occlusion, where traditional top-down SMPL-based methods struggle. It introduces Divide and Fuse (D&F), a bottom-up framework built on Human Part Parametric Models (HPPM) that splits the body into 15 parts and reconstructs each part independently from a few shape and global transformation parameters, using a Swin Transformer backbone. A fusion module with overlapping regions and self-supervised losses ($\mathcal{L}_{ol}$, $\mathcal{L}_{dc}$) then seamlessly connects adjacent parts to form a coherent mesh, even when only a subset of parts is visible. The authors provide two partially visible benchmarks, PV-Human3.6M and PV-3DPW, and demonstrate that D&F yields superior mesh and joint accuracy compared to state-of-the-art methods under heavy occlusion, with ablations confirming the importance of part-wise supervision, overlapping handling, and gradual fusion. Overall, D&F offers a robust, modular approach to partially visible human reconstruction that improves reliability in occluded scenarios and motivates further extensions to richer body models and automatic part detection.
Abstract
We introduce a novel bottom-up approach for human body mesh reconstruction, specifically designed to address the challenges posed by partial visibility and occlusion in input images. Traditional top-down methods, relying on whole-body parametric models like SMPL, falter when only a small part of the human is visible, as they require visibility of most of the human body for accurate mesh reconstruction. To overcome this limitation, our method employs a "Divide and Fuse (D&F)" strategy, reconstructing human body parts independently before fusing them, thereby ensuring robustness against occlusions. We design Human Part Parametric Models (HPPM) that independently reconstruct the mesh from a few shape and global-location parameters, without inter-part dependency. A specially designed fusion module then seamlessly integrates the reconstructed parts, even when only a few are visible. We harness a large volume of ground-truth SMPL data to train our parametric mesh models. To facilitate the training and evaluation of our method, we have established benchmark datasets featuring images of partially visible humans with HPPM annotations. Our experiments, conducted on these benchmark datasets, demonstrate the effectiveness of our D&F method, particularly in scenarios with substantial invisibility, where traditional approaches struggle to maintain reconstruction quality.
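To make the divide-and-fuse idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each part mesh is an array of 3D vertices, that adjacent parts share some vertices through an overlapping region, and that a per-part visibility flag is available. The function names (`overlap_loss`, `fuse_parts`), the part names, and the index layout are all hypothetical. The squared-distance penalty is only one plausible form of an overlap-consistency term, and plain averaging stands in for the learned fusion module described above.

```python
import numpy as np

def overlap_loss(part_a, part_b, idx_a, idx_b):
    """Mean squared distance between vertices that two adjacent parts share.

    Assumed form of an overlap-consistency penalty; the paper's exact
    definition of its losses is not given in this section.
    """
    diff = part_a[idx_a] - part_b[idx_b]
    return float(np.mean(np.sum(diff ** 2, axis=1)))

def fuse_parts(parts, global_ids, visible, num_vertices):
    """Average per-part vertex predictions into one mesh, visible parts only.

    parts:      dict part name -> (n_i, 3) array of predicted vertices
    global_ids: dict part name -> (n_i,) int array of indices in the full mesh
    visible:    dict part name -> bool
    Returns the fused (num_vertices, 3) array (NaN where no visible part
    covers a vertex) and a boolean coverage mask.
    """
    total = np.zeros((num_vertices, 3))
    count = np.zeros(num_vertices)
    for name, verts in parts.items():
        if not visible[name]:
            continue  # invisible parts contribute nothing
        # np.add.at accumulates correctly even if indices repeat
        np.add.at(total, global_ids[name], verts)
        np.add.at(count, global_ids[name], 1)
    covered = count > 0
    fused = np.full((num_vertices, 3), np.nan)
    fused[covered] = total[covered] / count[covered, None]
    return fused, covered

# Toy example: three hypothetical parts on a 6-vertex chain; vertex 2 is
# shared by the first two parts, and the third part is not visible.
parts = {
    "upper_arm": np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]),
    "forearm": np.array([[2.0, 0.2, 0.0], [3.0, 0.0, 0.0], [4.0, 0.0, 0.0]]),
    "hand": np.array([[4.0, 0.0, 0.0], [5.0, 0.0, 0.0]]),
}
global_ids = {
    "upper_arm": np.array([0, 1, 2]),
    "forearm": np.array([2, 3, 4]),
    "hand": np.array([4, 5]),
}
visible = {"upper_arm": True, "forearm": True, "hand": False}

seam_error = overlap_loss(parts["upper_arm"], parts["forearm"],
                          np.array([2]), np.array([0]))
fused, covered = fuse_parts(parts, global_ids, visible, num_vertices=6)
```

In this toy setup the shared vertex is averaged across the two visible parts, while the vertex covered only by the invisible part stays unset. This mirrors the property that fusion should still work when only a subset of parts is observed.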
