SARS: A Novel Face and Body Shape and Appearance Aware 3D Reconstruction System extends Morphable Models
Gulraiz Khan, Kenneth Y. Wertheim, Kevin Pimbblet, Waqas Ahmed
TL;DR
This work tackles monocular 3D human reconstruction with identity- and expression-aware facial details. It proposes SARS, a modular system comprising a face module that leverages semantic priors and a 3DMM-based face representation, a body module built on SPIN/SMPL, and a fusion module that integrates the two into a coherent full-body mesh. The face module fuses a latent encoding of displacement maps and signed distance fields with high-level semantics (age, gender, landmarks) via a StyleGAN2-based decoder, while the body module refines pose and shape through SMPLify and SMPL. The approach enables end-to-end, modular inference of face and body, offering boundary stitching, attention-based refinement, and a multi-task discriminator to enforce semantic consistency, and it achieves state-of-the-art or competitive results on the MICC Florence 3D, 3DPW, and EHF datasets. The work advances practical 3D avatar creation for AR/VR and fashion by delivering high-fidelity, identity-preserving full-body reconstructions from single images, and it points to future enhancements in semantic control, volumetric body modeling, and real-time performance.
Abstract
3D Morphable Models (3DMMs) are statistical models that take 2D images as input and recreate the structure and physical appearance of 3D objects, especially human faces and bodies. A 3DMM combines identity and expression blendshapes with a base face mesh to create a detailed 3D model. The variability of a 3DMM is controlled by tuning its parameters, which are high-level image descriptors such as shape, texture, illumination, and camera parameters. Previous research in 3D human reconstruction concentrated solely on global face structure or geometry, ignoring semantic facial features such as age, gender, and the facial landmarks that characterize facial boundaries, curves, dips, and wrinkles. To accommodate changes in these high-level facial characteristics, this work introduces a shape- and appearance-aware 3D reconstruction system (named SARS), a modular pipeline that extracts body and face information from a single image to reconstruct a 3D model of the full human body.
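The combination of a base face mesh with identity and expression blendshapes mentioned in the abstract is commonly written as a linear model, S = S_mean + B_id * alpha + B_exp * beta. The following is a minimal NumPy sketch of that generic formulation, not the SARS implementation: the function name `reconstruct_face`, the array shapes, and the random toy bases are all assumptions made for illustration.

```python
import numpy as np


def reconstruct_face(mean_shape, id_basis, exp_basis, alpha, beta):
    """Generic linear 3DMM (illustrative): S = mean + B_id @ alpha + B_exp @ beta.

    mean_shape: (V, 3) base mesh vertices
    id_basis:   (3V, K_id) identity blendshape basis
    exp_basis:  (3V, K_exp) expression blendshape basis
    alpha:      (K_id,) identity coefficients
    beta:       (K_exp,) expression coefficients
    """
    flat = mean_shape.reshape(-1) + id_basis @ alpha + exp_basis @ beta
    return flat.reshape(-1, 3)


# Toy data with hypothetical sizes; real models use thousands of vertices.
rng = np.random.default_rng(0)
n_vertices, n_id, n_exp = 4, 3, 2
mean_shape = rng.normal(size=(n_vertices, 3))
id_basis = rng.normal(size=(n_vertices * 3, n_id))
exp_basis = rng.normal(size=(n_vertices * 3, n_exp))

# Zero coefficients reproduce the base mesh.
neutral = reconstruct_face(
    mean_shape, id_basis, exp_basis, np.zeros(n_id), np.zeros(n_exp)
)

# Non-zero coefficients deform the base mesh along the blendshape directions.
alpha = np.array([0.5, -1.0, 0.2])
beta = np.array([1.0, 0.0])
posed = reconstruct_face(mean_shape, id_basis, exp_basis, alpha, beta)
```

In this sketch, tuning `alpha` changes who the face belongs to and tuning `beta` changes its expression, which is the sense in which a morphable model's variability is controlled by its parameters. Texture, illumination, and camera parameters are omitted here.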
