NaTex: Seamless Texture Generation as Latent Color Diffusion

Zeqiang Lai, Yunfei Zhao, Zibo Zhao, Xin Yang, Xin Huang, Jingwei Huang, Xiangyu Yue, Chunchao Guo

TL;DR

NaTex reframes texture generation as a native 3D problem by modeling textures as a dense color field in 3D space and using a latent diffusion pipeline. The approach couples a geometry-aware color VAE with a multi-control color DiT to enable geometry-conditioned, occlusion-aware, and highly coherent texture synthesis directly on surfaces, avoiding 2D lifting artifacts. Key contributions include the geometry-guided VAE, a flexible multi-control DiT, and diverse applications such as material generation and part texturing, with strong zero-shot and few-shot generalization. Empirically, NaTex delivers superior texture coherence and boundary alignment compared to MVD-based methods and demonstrates versatile downstream utility across 3D texture tasks.

Abstract

We present NaTex, a native texture generation framework that predicts texture color directly in 3D space. In contrast to previous approaches that rely on baking 2D multi-view images synthesized by geometry-conditioned Multi-View Diffusion models (MVDs), NaTex avoids several inherent limitations of the MVD pipeline. These include difficulties in handling occluded regions that require inpainting, achieving precise mesh-texture alignment along boundaries, and maintaining cross-view consistency and coherence in both content and color intensity. NaTex addresses these issues with a novel paradigm that views texture as a dense color point cloud. Driven by this idea, we propose latent color diffusion, which comprises a geometry-aware color point cloud VAE and a multi-control diffusion transformer (DiT), both trained entirely from scratch on 3D data, for texture reconstruction and generation. To enable precise alignment, we introduce native geometry control that conditions the DiT on direct 3D spatial information via positional embeddings and geometry latents. We co-design the VAE-DiT architecture, where the geometry latents are extracted via a dedicated geometry branch tightly coupled with the color VAE, providing fine-grained surface guidance that maintains strong correspondence with the texture. With these designs, NaTex demonstrates strong performance, significantly outperforming previous methods in texture coherence and alignment. NaTex also exhibits strong generalization capabilities, either training-free or with simple tuning, for various downstream applications, e.g., material generation, texture refinement, and part segmentation and texturing.
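
The abstract describes texture as a dense color point cloud whose 3D positions reach the model through positional embeddings. The sketch below is only a rough illustration of that idea, not the authors' implementation: the sinusoidal embedding, the frequency count, the concatenation of color onto the embedding, and the names `positional_embedding` and `build_tokens` are all assumptions introduced here, since the source does not specify them.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # xyz position, assumed normalized to [-1, 1]
Color = Tuple[float, float, float]  # RGB value, assumed to lie in [0, 1]


def positional_embedding(p: Sequence[float], num_freqs: int = 4) -> List[float]:
    """Sinusoidal embedding of a 3D position (an assumed choice, not from the paper)."""
    out: List[float] = []
    for coord in p:
        for k in range(num_freqs):
            freq = (2.0 ** k) * math.pi
            out.append(math.sin(freq * coord))
            out.append(math.cos(freq * coord))
    return out


def build_tokens(points: Sequence[Point], colors: Sequence[Color],
                 num_freqs: int = 4) -> List[List[float]]:
    """Pair each surface point's position embedding with its color (hypothetical layout)."""
    if len(points) != len(colors):
        raise ValueError("points and colors must have the same length")
    return [positional_embedding(p, num_freqs) + list(c)
            for p, c in zip(points, colors)]


# Two colored surface samples standing in for a dense color point cloud.
points = [(0.0, 0.0, 0.0), (0.5, -0.25, 1.0)]
colors = [(1.0, 0.0, 0.0), (0.2, 0.4, 0.6)]
tokens = build_tokens(points, colors)
```

Each token here has 3 coordinates x 4 frequencies x 2 (sine and cosine) + 3 color channels = 27 values. In the paper's pipeline the colors would be encoded by the VAE into latents rather than concatenated raw as in this toy layout.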

Paper Structure

This paper contains 15 sections, 3 equations, 18 figures, 2 tables.

Figures (18)

  • Figure 1: High-quality textured 3D assets generated by NaTex from a single image (geometry from Hunyuan3D 2.5 [lai2025hunyuan3d]).
  • Figure 2: Illustration of the fundamental challenges in multi-view diffusion (MVD) texturing, compared with the proposed NaTex.
  • Figure 3: Overall architecture of NaTex: it mainly consists of a geometry-aware color VAE for reconstruction and a multi-control color DiT for generation, adaptable for diverse applications. Left-most assets are all generated by NaTex.
  • Figure 4: Illustration of multi-control mechanisms of the proposed color DiT. Color control is useful for texture-conditioned tasks.
  • Figure 5: Visual results showcasing representative applications of NaTex. Additional results are provided in the Appendix.
  • ...and 13 more figures