TEXTRIX: Latent Attribute Grid for Native Texture Generation and Beyond
Yifei Zeng, Yajie Bao, Jiachen Qian, Shuang Wu, Youtian Lin, Hao Zhu, Buyu Li, Feihu Zhang, Xun Cao, Yao Yao
TL;DR
TEXTRIX introduces a native 3D texture generation framework that operates directly in a sparse latent 3D attribute grid, bypassing multi-view fusion and UV-space seams. It combines a Sparse VAE with a Diffusion Transformer equipped with sparse latent conditioning to generate high-fidelity textures and to perform precise 3D segmentation, all within a unified native representation. The approach demonstrates state-of-the-art performance on texture generation and 3D part segmentation across complex meshes and supports extensibility to PBR materials. Ablation studies confirm the critical role of the sparse conditioning and the rendering-based training objectives in achieving high fidelity and coherent cross-view results.
Abstract
Prevailing 3D texture generation methods, which often rely on multi-view fusion, are frequently hindered by inter-view inconsistencies and incomplete coverage of complex surfaces, limiting the fidelity and completeness of the generated content. To overcome these challenges, we introduce TEXTRIX, a native 3D attribute generation framework for high-fidelity texture synthesis and downstream applications such as precise 3D part segmentation. Our approach constructs a latent 3D attribute grid and leverages a Diffusion Transformer equipped with sparse attention, enabling direct coloring of 3D models in volumetric space and fundamentally avoiding the limitations of multi-view fusion. Built upon this native representation, the framework naturally extends to high-precision 3D segmentation by training the same architecture to predict semantic attributes on the grid. Extensive experiments demonstrate state-of-the-art performance on both tasks, producing seamless, high-fidelity textures and accurate 3D part segmentation with precise boundaries.
