MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment

Duygu Ceylan, Valentin Deschaintre, Thibault Groueix, Rosalie Martin, Chun-Hao Huang, Romain Rouffet, Vladimir Kim, Gaëtan Lassagne

TL;DR

MatAtlas tackles the problem of producing consistent, relightable textures for 3D meshes guided by text prompts. It uses a grid-pattern diffusion approach with cross-frame attention, depth and contour conditioning, followed by a multi-pass texture refinement to ensure coverage and reduce seams. A material retrieval and assignment stage uses LLM priors and a CLIP/color search to map textures to parametric materials, enabling relighting and editability. Across Objaverse and Google Scanned Objects, MatAtlas outperforms prior art in texture quality and material coherence, with ablations confirming the contribution of each component.

Abstract

We present MatAtlas, a method for consistent text-guided 3D model texturing. Following recent progress, we leverage a large-scale text-to-image generation model (e.g., Stable Diffusion) as a prior to texture a 3D model. We carefully design an RGB texturing pipeline that leverages a grid pattern diffusion, driven by depth and edges. By proposing a multi-step texture refinement process, we significantly improve the quality and 3D consistency of the texturing output. To further address the problem of baked-in lighting, we move beyond RGB colors and pursue assigning parametric materials to the assets. Given the high-quality initial RGB texture, we propose a novel material retrieval method that capitalizes on Large Language Models (LLMs), enabling editability and relightability. We evaluate our method on a wide variety of geometries and show that it significantly outperforms prior art. We also analyze the role of each component through a detailed ablation study.

Paper Structure

This paper contains 13 sections, 11 figures, 1 table.

Figures (11)

  • Figure 1: We present MatAtlas, a novel approach for generating text-guided relightable textures for a given 3D mesh. Our method first generates an RGB texture using pre-trained large-scale image generation models. Using this texture as guidance, we retrieve materials from a database and assign them to different parts of the mesh, resulting in a plausible and fully relightable asset.
  • Figure 2: Conditional generation. We utilize both depth and lineart based conditioning to guide the image generation model. While depth helps to preserve the underlying geometry, occluded and suggested contours represented in the line renderings help to capture details and avoid texture bleeding across different parts.
  • Figure 3: Multi-pass texture generation. An initial pass generates a globally consistent but potentially blurry texture. In the second pass, we refine the quality of this texture. We perform a final inpainting pass to ensure full coverage of any untextured parts of the model. We use lineart and depth cues to condition the image generation model.
  • Figure 4: Material retrieval. (a) Given a textured and segmented object, we rely on large language models to extract the global context and suggest different material types for each part. (b) We then perform a visual material search for each part within the corresponding material category. This search utilizes CLIP image embeddings as well as color features. (c) After assigning the retrieved materials to the mesh, we can relight the asset. Here we render it with Blender under a soft environment illumination.
  • Figure 5: We compare our method to Text2Tex and TEXTure, two recent state-of-the-art generative texturing methods that also utilize Stable Diffusion. We show the input meshes with the text prompts in the first row. Our results better preserve texture consistency.
  • ...and 6 more figures
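
The visual material search described in the Figure 4 caption (a search within an LLM-suggested category that uses image embeddings together with color features) can be illustrated with a minimal sketch. This is not the authors' implementation: the scoring formula, the `color_weight` parameter, the dictionary field names, and the toy three-dimensional vectors standing in for CLIP image embeddings are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def color_similarity(c1, c2):
    """Map the Euclidean distance between two RGB colors in [0, 1] to a score in [0, 1]."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))
    return 1.0 - dist / math.sqrt(3.0)  # sqrt(3) is the largest possible RGB distance


def retrieve_material(part, database, category, color_weight=0.5):
    """Return the best-scoring material in `category`, or None if the category is empty.

    The weighted sum of embedding and color similarity is an assumed scoring
    rule; the source only states that both kinds of features are used.
    """
    candidates = [m for m in database if m["category"] == category]
    if not candidates:
        return None

    def score(material):
        embed = cosine_similarity(part["embedding"], material["embedding"])
        color = color_similarity(part["mean_color"], material["mean_color"])
        return (1.0 - color_weight) * embed + color_weight * color

    return max(candidates, key=score)


# Hypothetical toy database; real entries would hold CLIP image embeddings.
database = [
    {"name": "oak_wood", "category": "wood",
     "embedding": [0.9, 0.1, 0.0], "mean_color": [0.55, 0.35, 0.20]},
    {"name": "dark_walnut", "category": "wood",
     "embedding": [0.8, 0.2, 0.1], "mean_color": [0.25, 0.15, 0.10]},
    {"name": "brushed_steel", "category": "metal",
     "embedding": [0.1, 0.9, 0.2], "mean_color": [0.60, 0.60, 0.62]},
]

# A textured mesh part whose category ("wood") is assumed to come from the LLM step.
part = {"embedding": [0.85, 0.15, 0.05], "mean_color": [0.50, 0.33, 0.22]}
best = retrieve_material(part, database, category="wood")
print(best["name"])
```

Restricting the candidates to one category before scoring mirrors the two-stage structure in the caption: the language model narrows the search space, and the visual features only rank materials inside it.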