MuMA: 3D PBR Texturing via Multi-Channel Multi-View Generation and Agentic Post-Processing

Lingting Zhu; Jingrui Ye; Runze Zhang; Zeyu Hu; Yingda Yin; Lanjiong Li; Jinnan Chen; Shengju Qian; Xin Wang; Qingmin Liao; Lequan Yu

TL;DR

MuMA tackles 3D PBR texturing in two stages: multi-view generation of shaded and albedo channels, followed by intrinsic decomposition to recover the remaining material channels, yielding high-fidelity, view-consistent textures under varying lighting. The approach uses SDXL with MV-Adapter for multi-view diffusion, feeds the shaded outputs to an intrinsic decomposition model (IDArb) to obtain metallic and roughness channels, and runs an agentic post-processing loop in which an MLLM (GPT-4o) scores albedo candidates and selects the best one, with Best-of-N selection as an option. Extensive experiments on a large Objaverse-derived dataset show that MuMA outperforms baselines in appearance and material fidelity for text-conditioned textured meshes and achieves competitive results in image-conditioned scenarios, while dramatically reducing texture-generation time. The work demonstrates a practical, scalable pipeline for high-quality 3D textures, with implications for faster, more reliable 3D content creation and relighting across diverse lighting conditions.

Abstract

Current methods for 3D generation still fall short in physically based rendering (PBR) texturing, primarily due to limited data and challenges in modeling multi-channel materials. In this work, we propose MuMA, a method for 3D PBR texturing through Multi-channel Multi-view generation and Agentic post-processing. Our approach features two key innovations: 1) We opt to model shaded and albedo appearance channels, where the shaded channel enables the integration of intrinsic decomposition modules for material properties. 2) Leveraging multimodal large language models, we emulate artists' techniques for material assessment and selection. Experiments demonstrate that MuMA achieves superior results in visual quality and material fidelity compared to existing methods.
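
The assessment-and-selection step can be pictured as a Best-of-N loop: generate several candidates, score each one, keep the highest-scoring result. The sketch below is illustrative only and is not the authors' implementation. The function `best_of_n`, the `toy_score` scorer, and the `residual_shading` field are hypothetical names introduced here; in practice the scorer would wrap a call to a multimodal model, which is deliberately left out.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(candidates: Sequence[T], score: Callable[[T], float]) -> Tuple[T, float]:
    """Return the highest-scoring candidate and its score.

    Ties are broken in favor of the earliest candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    # Score every candidate once, remembering its position.
    scored = [(score(c), i) for i, c in enumerate(candidates)]
    # Highest score wins; among equal scores the lowest index wins.
    best_score, best_idx = max(scored, key=lambda s: (s[0], -s[1]))
    return candidates[best_idx], best_score


def toy_score(candidate: dict) -> float:
    """Hypothetical stand-in for an MLLM judgment (assumption, not from the paper).

    Rewards candidates with less leftover shading baked into the albedo.
    """
    return 1.0 - candidate["residual_shading"]


# Made-up candidates for illustration; the values are not real results.
candidates = [
    {"name": "albedo_0", "residual_shading": 0.4},
    {"name": "albedo_1", "residual_shading": 0.1},
    {"name": "albedo_2", "residual_shading": 0.3},
]

best, best_score = best_of_n(candidates, toy_score)
print(best["name"])
```

Keeping the scorer as a plain callable means the selection logic can be tested with a deterministic stub, and a model-backed scorer can be swapped in later without changing the loop.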
