GraphicsDreamer: Image to 3D Generation with Physical Consistency
Pei Chen, Fudong Wang, Yixuan Tong, Jingdong Chen, Ming Yang, Minghui Yang
TL;DR
GraphicsDreamer tackles production-ready 3D asset generation from a single image by coupling a six-domain PBR-conditioned diffusion model with a PBR-constrained inverse rendering stage. It jointly models color, geometry, and intrinsic materials (albedo, roughness, metallic) across six views, then refines the output with a mixed implicit–explicit surface representation, topology optimization, and UV unwrapping. Quantitative results on the Google Scanned Objects dataset show better novel-view synthesis accuracy and surface reconstruction metrics than previous methods, and the method supports realistic relighting with environment maps. By delivering geometry, textures, and PBR maps in a single framework, it narrows the gap between automated 3D generation and production pipelines, producing artist-ready assets suitable for direct use in modern graphics engines.
Abstract
Recently, the surge of efficient and automated 3D AI-generated content (AIGC) methods has increasingly illuminated the path of transforming human imagination into complex 3D structures. However, the automated generation of 3D content still lags significantly in industrial application. This gap exists because 3D modeling demands high-quality assets with sharp geometry, exquisite topology, and physically based rendering (PBR) materials, among other criteria. To narrow the disparity between generated results and artists' expectations, we introduce GraphicsDreamer, a method for creating highly usable 3D meshes from single images. To better capture geometry and material details, we integrate the PBR lighting equation into our cross-domain diffusion model, concurrently predicting multi-view color, normal, and depth images together with PBR materials. In the geometry fusion stage, we continue to enforce the PBR constraints, ensuring that the generated 3D objects possess reliable texture details and support realistic relighting. Furthermore, our method incorporates topology optimization and fast UV unwrapping, allowing the 3D products to be seamlessly imported into graphics engines. Extensive experiments demonstrate that our model can produce high-quality 3D assets at a reasonable time cost compared with previous methods.
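The abstract refers to "the PBR lighting equation" without spelling it out. As an illustration only, the sketch below shades one surface point with the widely used Cook-Torrance metallic-roughness model (GGX distribution, Smith-Schlick geometry term, Schlick Fresnel). That model choice is an assumption, not something stated here, and the function name `shade_pbr` and its parameters are hypothetical. It shows how albedo, roughness, metallic, and the surface normal together determine the observed color, which is why these quantities can constrain one another during generation and inverse rendering.

```python
import math

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _normalize(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v) if n > 0.0 else tuple(v)

def shade_pbr(albedo, roughness, metallic, normal, view_dir, light_dir,
              light_rgb=(1.0, 1.0, 1.0)):
    """Outgoing RGB radiance for one point and one directional light.

    Assumed model (not taken from the paper): Cook-Torrance specular plus
    Lambertian diffuse, in the metallic-roughness parameterization.
    """
    n = _normalize(normal)
    v = _normalize(view_dir)
    l = _normalize(light_dir)

    n_l = max(_dot(n, l), 0.0)
    if n_l == 0.0:
        return (0.0, 0.0, 0.0)  # light is behind the surface

    h = _normalize(tuple(a + b for a, b in zip(v, l)))  # half vector
    n_v = max(_dot(n, v), 1e-4)  # clamp to avoid division by zero
    n_h = max(_dot(n, h), 0.0)
    v_h = max(_dot(v, h), 0.0)

    # GGX normal distribution term D
    alpha2 = (roughness ** 2) ** 2
    d = alpha2 / (math.pi * (n_h * n_h * (alpha2 - 1.0) + 1.0) ** 2)

    # Smith-Schlick geometry term G (k remapped for direct lighting)
    k = (roughness + 1.0) ** 2 / 8.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))

    # Schlick Fresnel term F; base reflectance blends 0.04 (dielectric)
    # with the albedo (metal) according to the metallic value
    f0 = tuple(0.04 * (1.0 - metallic) + a * metallic for a in albedo)
    f = tuple(c + (1.0 - c) * (1.0 - v_h) ** 5 for c in f0)

    specular = tuple(d * g * fc / (4.0 * n_l * n_v) for fc in f)
    diffuse = tuple((1.0 - fc) * (1.0 - metallic) * a / math.pi
                    for fc, a in zip(f, albedo))

    return tuple((df + sp) * lc * n_l
                 for df, sp, lc in zip(diffuse, specular, light_rgb))
```

Calling `shade_pbr((0.8, 0.2, 0.2), 0.5, 0.0, (0, 0, 1), (0, 0, 1), (0, 1, 1))` shades a red, non-metallic point lit from 45 degrees. Relighting with an environment map would integrate this response over many incoming directions rather than a single light.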
