MV2UV: Generating High-quality UV Texture Maps with Multiview Prompts

Zheng Zhang, Qinchuan Zhang, Yuteng Ye, Zhi Chen, Penglei Ji, Mengfei Li, Wenxiao Zhang, Yuan Liu

Abstract

Generating high-quality textures for 3D assets is a challenging task. Existing multiview texture generation methods suffer from multiview inconsistency and missing textures on unseen parts, while UV inpainting methods generalize poorly because of insufficient UV data and cannot fully exploit 2D image diffusion priors. In this paper, we propose MV2UV, a new method that combines the 2D generative priors of multiview generation with the inpainting ability of UV refinement to produce high-quality texture maps. Our key idea is to adopt a UV-space generative model that simultaneously inpaints the unseen parts of multiview images and resolves their inconsistencies. Experiments show that our method achieves better texture generation quality than existing methods, especially in unseen, occluded, and multiview-inconsistent parts.

Paper Structure

This paper contains 19 sections, 4 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Overview of the MV2UV framework. We first generate multiview images (MVs) using multiview diffusion (MVD). We then treat the generated MVs as semantic prompts that guide texture generation on the UV map. The generation process is based on a diffusion U-Net, which primarily consists of Self Attention, UV Self Attention, and Reference Attention modules.
  • Figure 2: We use 3D coordinates as the positional encoding within MV-UV cross attention, enabling the UV map to attend to geometrically corresponding regions on multiview images for improved inpainting and inconsistency resolution.
  • Figure 3: The geometric position encoding. We first render the object mesh to obtain normal and position maps, which are processed in two separate branches: one in UV space and the other in view space. In each branch, the corresponding geometric maps are transformed into positional embeddings and further encoded by a learnable position encoder to produce geometric features.
  • Figure 4: The UV Islands. UV self attention enables the model to establish correlations between UV regions (e.g., $P_A$ and $P_B$) that are adjacent in 3D space, even when they are disconnected in the UV layout.
  • Figure 5: Comparisons with texture generation methods from a single image input. Our method enables automatic completion of textures in occluded UV regions while avoiding inconsistencies induced by direct projection.
  • ...and 5 more figures
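
The idea in the Figure 2 caption, using 3D coordinates as the positional encoding so that a UV texel attends to geometrically corresponding multiview pixels, can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not the paper's implementation: queries and keys here consist only of a fixed sinusoidal embedding of the 3D position (the paper describes a learnable position encoder and a full diffusion U-Net), and the function names, the number of frequencies, and the sample points are all hypothetical. Coordinates are assumed to be normalized to [0, 1].

```python
import math

def sinusoidal_embedding(xyz, num_freqs=4):
    """Encode a 3D point as sin/cos features (hypothetical stand-in for a learned encoder)."""
    feats = []
    for coord in xyz:
        for k in range(num_freqs):
            freq = (2.0 ** k) * math.pi
            feats.append(math.sin(freq * coord))
            feats.append(math.cos(freq * coord))
    return feats

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def position_cross_attention(uv_positions, mv_positions, mv_values):
    """UV texels (queries) attend over multiview pixels (keys) using position features only."""
    keys = [sinusoidal_embedding(p) for p in mv_positions]
    channels = len(mv_values[0])
    outputs, all_weights = [], []
    for p in uv_positions:
        query = sinusoidal_embedding(p)
        scale = math.sqrt(len(query))
        # Scaled dot-product score between this texel and every multiview pixel.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        weights = softmax(scores)
        # Weighted average of the multiview pixel values (e.g. RGB colors).
        out = [sum(w * v[c] for w, v in zip(weights, mv_values)) for c in range(channels)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical data: 3D surface points seen by multiview pixels, with their colors.
mv_positions = [(0.1, 0.2, 0.3), (0.8, 0.5, 0.1), (0.4, 0.9, 0.6)]
mv_values = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# Two UV texels whose 3D positions coincide with multiview pixels 1 and 2.
uv_positions = [(0.8, 0.5, 0.1), (0.4, 0.9, 0.6)]

outputs, weights = position_cross_attention(uv_positions, mv_positions, mv_values)
best_matches = [w.index(max(w)) for w in weights]
```

With this encoding, the dot product between two embeddings reduces to a sum of cosines of scaled coordinate differences, so within [0, 1] it peaks when the two 3D points coincide. Each texel therefore puts its largest attention weight on the multiview pixel at the same surface location, regardless of where that texel sits in the UV layout.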