IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination
Xi Chen, Sida Peng, Dongchen Yang, Yuan Liu, Bowen Pan, Chengfei Lv, Xiaowei Zhou
TL;DR
This work tackles inverse rendering under unknown static illumination by introducing conditional diffusion priors for albedo and specular shading, which regularize the inherent ambiguity between material and lighting. The authors use a two-stage coarse-to-fine optimization: they first obtain a rough material and lighting estimate from the diffusion priors, then use that estimate to guide diffusion sampling toward multi-view consistency. By separating diffuse and specular components and training the priors on large 3D-object datasets, the method achieves state-of-the-art material recovery and relighting on synthetic and real data, with good generalization to internet images. The approach offers a practical, data-driven way to resolve one of inverse rendering’s fundamental ambiguities and a scalable framework for material and lighting estimation under unknown illumination.
Abstract
This paper aims to recover object materials from posed images captured under an unknown static lighting condition. Recent methods solve this task by optimizing material parameters through differentiable physically based rendering. However, due to the coupling between object geometry, materials, and environment lighting, there is inherent ambiguity in the inverse rendering process, which prevents previous methods from obtaining accurate results. To overcome this ill-posed problem, our key idea is to learn a material prior with a generative model and use it to regularize the optimization process. We observe that the general rendering equation can be split into diffuse and specular shading terms, and thus formulate the material prior as diffusion models of albedo and specular shading. Thanks to this design, our model can be trained using the abundant existing 3D object data, and naturally acts as a versatile tool to resolve the ambiguity when recovering material representations from RGB images. In addition, we develop a coarse-to-fine training strategy that leverages estimated materials to guide the diffusion models to satisfy multi-view consistency constraints, leading to more stable and accurate results. Extensive experiments on real-world and synthetic datasets demonstrate that our approach achieves state-of-the-art performance on material recovery. The code will be available at https://zju3dv.github.io/IntrinsicAnything.
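The diffuse/specular split and the material-lighting ambiguity described above can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a Lambertian diffuse term, so that an image can be written per pixel as albedo times diffuse shading plus specular shading, and the function `render`, the array names, and the random values are hypothetical placeholders chosen only for illustration.

```python
import numpy as np


def render(albedo, diffuse_shading, specular_shading):
    """Compose an image as albedo * diffuse shading + specular shading.

    Assumed simplification: the diffuse lobe is Lambertian, so albedo
    factors out of the diffuse integral of the rendering equation.
    """
    return albedo * diffuse_shading + specular_shading


# Hypothetical per-pixel quantities for a tiny 4x4 RGB image.
rng = np.random.default_rng(0)
height, width = 4, 4
albedo = rng.uniform(0.2, 0.8, size=(height, width, 3))
diffuse_shading = rng.uniform(0.5, 1.5, size=(height, width, 3))
specular_shading = rng.uniform(0.0, 0.1, size=(height, width, 3))

image = render(albedo, diffuse_shading, specular_shading)

# Ambiguity: brightening the albedo by k while dimming the diffuse
# shading (i.e. the lighting) by 1/k reproduces the same image.
k = 1.25
image_alt = render(albedo * k, diffuse_shading / k, specular_shading)
same_image = np.allclose(image, image_alt)
```

Because both decompositions explain the observed pixels equally well, a photometric loss alone cannot choose between them; this is the gap that a learned prior over albedo and specular shading is meant to fill.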
