CTGAN: Semantic-guided Conditional Texture Generator for 3D Shapes
Yi-Ting Pan, Chai-Rong Lee, Shu-Ho Fan, Jheng-Wei Su, Jia-Bin Huang, Yung-Yu Chuang, Hung-Kuo Chu
TL;DR
CTGAN generates high-fidelity, view-consistent textures for 3D shapes by conditioning texture generation on semantic segmentation maps and reference style images. It builds on StyleGAN2-ADA with a latent space disentangled into structure and style components, each learned by its own encoder; a coarse-to-fine structure encoder enforces semantic alignment. A canonical-view texture atlas parameterizes textures across multiple views, and a three-stage training regime with L2, LPIPS, and MoCo losses yields state-of-the-art results on ShapeNet cars and FFHQ faces in both conditional and unconditional settings. The method enables controllable, semantically guided texture synthesis for 3D objects, which is useful for fast, consistent texturing in game, film, and AR/VR pipelines. The authors note limitations in handling occlusion and seams at view boundaries.
Abstract
The entertainment industry relies on 3D visual content to create immersive experiences, but traditional methods for creating textured 3D models can be time-consuming and subjective. Generative networks such as StyleGAN have advanced image synthesis, but generating 3D objects with high-fidelity textures is still not well explored, and existing methods have limitations. We propose the Semantic-guided Conditional Texture Generator (CTGAN), which produces high-quality textures for 3D shapes that are consistent with the viewing angle while respecting shape semantics. CTGAN exploits the disentangled nature of StyleGAN to finely manipulate the input latent codes, enabling explicit control over both the style and the structure of the generated textures. A coarse-to-fine encoder architecture is introduced to enhance control over the structure of the resulting textures via input segmentation. Experimental results show that CTGAN outperforms existing methods on multiple quality metrics and achieves state-of-the-art performance on texture generation in both conditional and unconditional settings.
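To make the idea of separate structure and style control concrete, the following minimal sketch shows one common way a StyleGAN-like generator's per-layer latent codes can be mixed: the early (coarse) layers take codes from a structure source and the later (fine) layers take codes from a style source. This is an illustration under stated assumptions, not the authors' implementation. The function name `compose_latent`, the `split_layer` parameter, and the toy layer count and code dimension are all hypothetical; the abstract does not specify how CTGAN partitions its latent codes.

```python
from typing import List

Code = List[float]  # one per-layer latent vector (toy representation)


def compose_latent(structure_codes: List[Code],
                   style_codes: List[Code],
                   split_layer: int) -> List[Code]:
    """Mix two per-layer latent stacks.

    Layers [0, split_layer) come from `structure_codes` (assumed to be
    produced by a structure encoder from a segmentation map); layers
    [split_layer, n) come from `style_codes` (assumed to be produced by
    a style encoder from a reference image). The split point is a
    hypothetical hyperparameter, not a value taken from the paper.
    """
    if len(structure_codes) != len(style_codes):
        raise ValueError("both stacks must have the same number of layers")
    if not 0 <= split_layer <= len(structure_codes):
        raise ValueError("split_layer out of range")
    coarse = [list(c) for c in structure_codes[:split_layer]]
    fine = [list(c) for c in style_codes[split_layer:]]
    return coarse + fine


# Toy usage: 4 layers with 2-dimensional codes (real generators use
# many more layers and much larger code dimensions).
structure = [[1.0, 1.0] for _ in range(4)]
style = [[2.0, 2.0] for _ in range(4)]
mixed = compose_latent(structure, style, split_layer=2)
```

Under this assumption, swapping the reference image changes only the later-layer codes, so the layout dictated by the segmentation map would stay fixed while the appearance varies.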
