MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR
Xudong Xu, Zhaoyang Lyu, Xingang Pan, Bo Dai
TL;DR
The paper addresses the difficulty of recovering high-fidelity object materials in text-to-3D generation. It introduces MATLABER, which trains a latent BRDF auto-encoder on real-world BRDF data so that latent codes decode to physically plausible BRDF parameters; this helps disentangle materials from environment lighting and supports relightable rendering. Appearance is modeled by a material MLP that predicts BRDF latent codes over a DMTet geometry, guided by Score Distillation Sampling and rendered with a Cook–Torrance BRDF model. The latent space is kept smooth through KL, smoothness, and cyclic losses. Experiments show improved realism, detail, and disentanglement over baselines, along with successful relighting and material editing. The learned material prior can also carry over to downstream tasks. Geometry-related limitations remain, which future work could address with diversification strategies such as Variational Score Distillation.
Abstract
Based on powerful text-to-image diffusion models, text-to-3D generation has made significant progress in generating compelling geometry and appearance. However, existing methods still struggle to recover high-fidelity object materials: they either consider only Lambertian reflectance or fail to disentangle BRDF materials from the environment lights. In this work, we propose Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR (MATLABER), which leverages a novel latent BRDF auto-encoder for material generation. We train this auto-encoder on large-scale real-world BRDF collections and ensure the smoothness of its latent space, which implicitly acts as a natural distribution of materials. During appearance modeling in text-to-3D generation, a material network predicts latent BRDF embeddings rather than raw BRDF parameters. Through extensive experiments, our approach demonstrates superiority over existing methods in generating realistic and coherent object materials. Moreover, high-quality materials naturally enable multiple downstream tasks such as relighting and material editing. Code and model will be publicly available at https://sheldontsui.github.io/projects/Matlaber.
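To make the idea of a latent BRDF auto-encoder more concrete, the following is a minimal NumPy sketch of a VAE-style forward pass with a reconstruction term and a KL term. It is an illustration under stated assumptions, not the authors' implementation: the dimensions `BRDF_DIM` and `LATENT_DIM`, the single linear layers, the sigmoid output, and the `kl_weight` value are all hypothetical choices, and the smoothness and cyclic losses mentioned in the TL;DR are omitted.

```python
import numpy as np

BRDF_DIM = 7    # assumption: size of one BRDF parameter vector
LATENT_DIM = 4  # assumption: size of the latent BRDF code


def init_params(rng):
    """Randomly initialise a one-layer encoder and a one-layer decoder."""
    return {
        "W_enc": rng.normal(0.0, 0.1, (BRDF_DIM, 2 * LATENT_DIM)),
        "b_enc": np.zeros(2 * LATENT_DIM),
        "W_dec": rng.normal(0.0, 0.1, (LATENT_DIM, BRDF_DIM)),
        "b_dec": np.zeros(BRDF_DIM),
    }


def encode(params, x):
    """Map BRDF vectors to the mean and log-variance of a Gaussian posterior."""
    h = x @ params["W_enc"] + params["b_enc"]
    return h[:, :LATENT_DIM], h[:, LATENT_DIM:]


def decode(params, z):
    """Map latent codes back to BRDF vectors; the sigmoid keeps values in (0, 1)."""
    logits = z @ params["W_dec"] + params["b_dec"]
    return 1.0 / (1.0 + np.exp(-logits))


def kl_to_standard_normal(mu, log_var):
    """Batch-mean KL divergence between N(mu, exp(log_var)) and N(0, I)."""
    per_sample = -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=1)
    return float(np.mean(per_sample))


def vae_loss(params, x, rng, kl_weight=1e-3):
    """Reconstruction (MSE) plus weighted KL, using the reparameterisation trick."""
    mu, log_var = encode(params, x)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    x_hat = decode(params, z)
    recon = float(np.mean((x_hat - x) ** 2))
    kl = kl_to_standard_normal(mu, log_var)
    return recon + kl_weight * kl, recon, kl


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = init_params(rng)
    batch = rng.uniform(0.0, 1.0, (8, BRDF_DIM))  # stand-in for real BRDF samples
    total, recon, kl = vae_loss(params, batch, rng)
    print("total:", total, "recon:", recon, "kl:", kl)
```

In this sketch, a downstream material network would output a `LATENT_DIM`-sized code per surface point and pass it through `decode`, so that predicted materials stay within the range the decoder has learned to produce. The KL term is what pulls the latent codes toward a standard normal prior.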
