S2TD-Face: Reconstruct a Detailed 3D Face with Controllable Texture from a Single Sketch
Zidu Wang, Xiangyu Zhu, Jiang Yu, Tianshuo Zhang, Zhen Lei
TL;DR
S2TD-Face tackles sketch-to-3D-face reconstruction by introducing a two-stage geometry pipeline that first predicts coarse 3DMM-based geometry and then refines details with a UV-space displacement map, guided by a novel sketch-to-geometry loss that preserves delicate sketch features. A texture-control module leverages CLIP-based text-image matching to select textures from a library and fuse them onto the UV-mapped geometry, using PCA albedo to fill occluded regions. The framework operates without 3D scans, using 2D supervisory signals (landmarks, segmentation) and differentiable rendering to supervise both coarse and fine geometry across diverse sketch styles. Extensive experiments on the Sketch-REALY benchmark show state-of-the-art geometry accuracy and high-quality, controllable textures, with practical applications in avatars, animation, and missing-person search. The approach highlights the importance of aligning geometry directly with sketch features while enabling expressive texture variation via natural language prompts.
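To make the coarse-to-fine idea concrete, the following is a minimal sketch of how a UV-space displacement map could refine a coarse mesh. It is an illustration under stated assumptions, not the authors' implementation: it assumes that each vertex is offset along its unit normal by a scalar read from the map, that the lookup is nearest-neighbour, and that UV coordinates lie in [0, 1]. The names `sample_nearest` and `refine_vertices` are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
UV = Tuple[float, float]


def sample_nearest(disp_map: List[List[float]], u: float, v: float) -> float:
    """Nearest-neighbour lookup in a UV displacement map (u, v in [0, 1])."""
    height, width = len(disp_map), len(disp_map[0])
    # Clamp so that u == 1.0 or v == 1.0 still maps to the last texel.
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return disp_map[row][col]


def refine_vertices(
    coarse: List[Vec3],
    normals: List[Vec3],
    uvs: List[UV],
    disp_map: List[List[float]],
) -> List[Vec3]:
    """Offset each coarse vertex along its normal by the sampled displacement.

    Assumption (not taken from the paper): the displacement is a scalar
    applied along the per-vertex unit normal.
    """
    refined = []
    for (x, y, z), (nx, ny, nz), (u, v) in zip(coarse, normals, uvs):
        d = sample_nearest(disp_map, u, v)
        refined.append((x + d * nx, y + d * ny, z + d * nz))
    return refined
```

For example, with a 2x2 map `[[0.5, 0.0], [0.0, -0.25]]`, a vertex at UV `(0.1, 0.1)` moves 0.5 units outward along its normal, while one at `(0.9, 0.9)` moves 0.25 units inward.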
Abstract
Reconstructing textured 3D faces from sketches is a highly promising but underdeveloped research topic, with applications in animation, 3D avatars, artistic design, missing-person search, and more. On the one hand, the stylistic diversity of sketches means that existing sketch-to-3D-face methods can handle only pose-limited and realistically shaded sketches. On the other hand, texture plays a vital role in representing facial appearance, yet sketches lack this information, so the reconstruction process needs additional texture control. This paper proposes S2TD-Face, a novel method for reconstructing detailed 3D faces with controllable texture from sketches. S2TD-Face introduces a two-stage geometry reconstruction framework that directly reconstructs detailed geometry from the input sketch. To keep the geometry consistent with the delicate strokes of the sketch, we propose a novel sketch-to-geometry loss that ensures the reconstruction accurately fits input features such as dimples and wrinkles. Our training strategies do not rely on hard-to-obtain 3D face scanning data or labor-intensive hand-drawn sketches. Furthermore, S2TD-Face introduces a texture control module that uses text prompts to select the most suitable textures from a library and seamlessly integrate them into the geometry, resulting in a detailed 3D face with controllable texture. S2TD-Face surpasses existing state-of-the-art methods in extensive quantitative and qualitative experiments. Our project is available at https://github.com/wang-zidu/S2TD-Face .
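The text-driven texture selection described above can be illustrated with a small sketch. It assumes that the text prompt and every texture in the library have already been embedded into a shared space (for instance by a CLIP-style encoder, which is not invoked here) and that the best match is the texture with the highest cosine similarity to the prompt. The function names and the toy three-dimensional embeddings are hypothetical and serve only as an illustration.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_texture(
    text_embedding: List[float],
    texture_embeddings: Dict[str, List[float]],
) -> str:
    """Return the name of the library texture closest to the text prompt.

    Assumption: embeddings are precomputed by a shared text-image encoder;
    this function only performs the matching step.
    """
    return max(
        texture_embeddings,
        key=lambda name: cosine_similarity(text_embedding, texture_embeddings[name]),
    )


# Toy library with made-up embeddings (hypothetical values).
library = {
    "pale_freckled": [1.0, 0.0, 0.0],
    "tanned_beard": [0.0, 1.0, 0.0],
}
prompt_embedding = [0.9, 0.1, 0.0]
best = select_texture(prompt_embedding, library)
```

In a real pipeline the selected texture would then be fused onto the UV-mapped geometry; that step is outside the scope of this sketch.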
