MatE: Material Extraction from Single-Image via Geometric Prior
Zeyu Zhang, Wei Zhai, Jian Yang, Yang Cao
TL;DR
MatE presents a coarse-to-fine framework for recovering tileable PBR material maps from a single image. It first rectifies perspective distortion using an estimated depth map as a geometric prior, then refines residual distortions with a dual-branch diffusion model conditioned through KV injection. Training relies on rotation-aligned synthetic data to bridge the synthetic-real domain gap and enforce consistent material orientation. Empirical results on synthetic and real datasets show state-of-the-art performance in perceptual metrics (LPIPS, CLIP) while maintaining competitive structural fidelity (SSIM), highlighting robustness to viewpoint and illumination variations. The work offers a practical, diffusion-based path toward democratizing high-quality material extraction for real-world graphics pipelines, with insights on tileability and implementation trade-offs.
Abstract
The creation of high-fidelity, physically-based rendering (PBR) materials remains a bottleneck in many graphics pipelines, typically requiring specialized equipment and expert-driven post-processing. To democratize this process, we present MatE, a novel method for generating tileable PBR materials from a single image taken under unconstrained, real-world conditions. Given an image and a user-provided mask, MatE first performs coarse rectification using an estimated depth map as a geometric prior, and then employs a dual-branch diffusion model. Leveraging a consistency learned from rotation-aligned and scale-aligned training data, this model further rectifies residual distortions in the coarse result and translates it into a complete set of material maps, including albedo, normal, roughness and height. Our framework achieves invariance to the unknown illumination and perspective of the input image, allowing for the recovery of intrinsic material properties from casual captures. Through comprehensive experiments on both synthetic and real-world data, we demonstrate the efficacy and robustness of our approach, enabling users to create realistic materials from real-world images.
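
To make the coarse rectification step concrete, the following is a minimal sketch of one standard way to use a depth map as a geometric prior; it is not necessarily the procedure MatE uses. It assumes the masked surface is roughly planar and that pinhole intrinsics `K` are known or guessed. The function names (`fit_plane_normal`, `rotation_to_frontal`, `rectifying_homography`) are hypothetical and introduced only for illustration. The sketch back-projects the masked depth pixels to 3D, fits a plane, and builds the homography of a virtual camera rotated to face that plane head-on.

```python
import numpy as np

def fit_plane_normal(depth, mask, K):
    """Back-project masked pixels to 3D and return the unit normal of a
    least-squares plane fit (assumption: the masked surface is near-planar)."""
    v, u = np.nonzero(mask)                      # row (v) and column (u) indices
    z = depth[v, u].astype(np.float64)
    pix = np.stack([u, v, np.ones_like(u)]).astype(np.float64)  # 3 x N homogeneous pixels
    pts = (np.linalg.inv(K) @ pix) * z           # 3 x N camera-space points
    centered = pts - pts.mean(axis=1, keepdims=True)
    # The eigenvector with the smallest eigenvalue of the scatter matrix is the normal.
    _, eigvecs = np.linalg.eigh(centered @ centered.T)
    n = eigvecs[:, 0]
    return -n if n[2] > 0 else n                 # orient the normal toward the camera

def rotation_to_frontal(n):
    """Rotation R with R @ n = (0, 0, -1), via the Rodrigues formula.
    Assumes n is a unit vector with n[2] <= 0, as returned by fit_plane_normal."""
    t = np.array([0.0, 0.0, -1.0])
    v = np.cross(n, t)
    c = float(n @ t)                             # cosine of the angle, >= 0 here
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx / (1.0 + c)

def rectifying_homography(depth, mask, K):
    """Homography H = K R K^{-1} that maps input pixels to a virtual camera
    in which the fitted plane is fronto-parallel."""
    R = rotation_to_frontal(fit_plane_normal(depth, mask, K))
    return K @ R @ np.linalg.inv(K)
```

The resulting 3x3 matrix could be passed to an image-warping routine such as OpenCV's `cv2.warpPerspective` to produce the coarse rectified image. A pure-rotation homography can shift content out of frame, so in practice a translation and scale would be composed with `H` to keep the masked region in view. Non-planar surfaces and depth-estimation errors leave residual distortion, which is the part the abstract assigns to the diffusion model.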
