Intrinsic Image Decomposition Using Point Cloud Representation
Xiaoyan Xing, Konrad Groh, Sezer Karaoglu, Theo Gevers
TL;DR
This work tackles intrinsic image decomposition (IID) by moving from 2D images to 3D point-cloud representations. It introduces PoInt-Net, a point-based network with three specialized components (Point Albedo-Net, Light Direction Estimation Net, and Learnable Shader) that jointly estimate albedo and shading from colored point clouds under a two-stage training regime. The approach is parameter-efficient and generalizes zero-shot: even when trained only on single-object scenes, it achieves state-of-the-art or competitive results on the ShapeNet-Intrinsic, MIT-Intrinsic, MPI-Sintel, Inverender, and IIW datasets. The findings highlight the advantages of point-cloud-based priors for IID, robustness to noisy depth, and applicability to real-world scenes; the paper also notes limitations and outlines directions for future work.
Abstract
Intrinsic decomposition aims to separate an image into its albedo (reflectance properties) and shading (illumination properties) components. The problem is challenging because it is ill-posed. Conventional approaches concentrate primarily on 2D imagery and fail to fully exploit 3D data representations. 3D point clouds offer a more comprehensive format for representing scenes, as they combine geometric and color information. To this end, we introduce Point Intrinsic Net (PoInt-Net), which leverages 3D point cloud data to jointly estimate albedo and shading maps. PoInt-Net has three main merits. First, the model is efficient: it needs to be trained only on small-scale point clouds, yet performs consistently on point clouds of any size. Second, it is robust: even when trained exclusively on datasets of individual objects, PoInt-Net generalizes well to unseen objects and scenes. Third, it is more accurate than conventional 2D approaches, improving on them across various metrics and datasets. (Code released.)
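To make the setting concrete, the sketch below illustrates the two ingredients the abstract refers to: a colored point cloud built from an image with per-pixel depth, and an image formed as albedo times shading. It is a toy illustration under stated assumptions, not PoInt-Net itself: it assumes a pinhole camera with hypothetical intrinsics (fx, fy, cx, cy), a simple Lambertian shading term max(0, n . l), and per-point composition I = A * S. All function names are made up for this example, and no learned network is involved.

```python
import math


def backproject(depth, rgb, fx, fy, cx, cy):
    """Lift an RGB-D image to a colored point cloud.

    depth: 2D list of depth values; rgb: 2D list of (r, g, b) tuples.
    Assumes a pinhole camera; returns a list of (x, y, z, r, g, b).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without valid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            r, g, b = rgb[v][u]
            points.append((x, y, z, r, g, b))
    return points


def normalize(v):
    """Scale a non-zero 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def lambert_shading(normal, light_dir):
    """Assumed Lambertian shading: max(0, n . l) for unit n and l."""
    n = normalize(normal)
    l = normalize(light_dir)
    return max(0.0, sum(a * b for a, b in zip(n, l)))


def compose(albedo, shading):
    """Image formation under the assumed model: I = A * S per channel."""
    return tuple(a * shading for a in albedo)


def recover_albedo(color, shading, eps=1e-6):
    """Invert I = A * S when shading is known (eps avoids division by zero)."""
    return tuple(c / max(shading, eps) for c in color)


if __name__ == "__main__":
    depth = [[1.0, 2.0], [0.0, 4.0]]
    rgb = [[(0.8, 0.4, 0.2)] * 2, [(0.1, 0.1, 0.1)] * 2]
    cloud = backproject(depth, rgb, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
    s = lambert_shading((0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
    print(len(cloud), compose((0.8, 0.4, 0.2), s))
```

The ill-posedness mentioned above shows up directly in `compose`: many (albedo, shading) pairs produce the same observed color, so recovering albedo requires additional information. In this toy version the shading is simply given; the geometry carried by a point cloud is the kind of extra cue the paper proposes to exploit.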
