
Deep Polarization Cues for Single-shot Shape and Subsurface Scattering Estimation

Chenhao Li, Trung Thanh Ngo, Hajime Nagahara

TL;DR

This work addresses the problem of jointly estimating geometry and homogeneous subsurface scattering (SSS) parameters for translucent objects from polarization cues in a single shot. It introduces a stage-wise network that ingests four linearly polarized images $I_0, I_{45}, I_{90}, I_{135}$ plus derived $I_{\text{max}}$, $I_{\text{min}}$ and a mask $M$ to predict shape $(N,D)$ and illumination $(\text{sh}, i)$, followed by SSS parameters $(\tilde{\sigma_t}, \tilde{\alpha}, \tilde{g})$ guided by the estimated shape/illumination, with a reconstruction network trained using four pure BSDF polarized images. A large synthetic dataset of $117{,}000$ polarized scenes is built from ShapeNet objects, bump maps, and Laval Indoor HDR lighting to train the model, and the approach outperforms SfP baselines and a prior SSS method on both synthetic and real data. The work demonstrates that polarization cues, particularly the proposed $I_{\text{max}}/I_{\text{min}}$ representations, can effectively disambiguate surface and subsurface contributions, enabling robust, single-shot estimation of complex translucent materials.
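The derived inputs $I_{\text{max}}$ and $I_{\text{min}}$ can be illustrated with a small per-pixel sketch. It assumes the standard linear Stokes model, $I(\phi) = \tfrac{1}{2}(S_0 + S_1\cos 2\phi + S_2\sin 2\phi)$; this summary does not state the exact formula used in the paper, so treat the computation as one plausible reading. The function names (`polarizer_intensity`, `stokes_from_polarizer_images`, `polarization_extrema`) are hypothetical and chosen only for illustration.

```python
import math


def polarizer_intensity(s0, s1, s2, phi):
    """Intensity seen through a linear polarizer at angle phi (radians)."""
    return 0.5 * (s0 + s1 * math.cos(2.0 * phi) + s2 * math.sin(2.0 * phi))


def stokes_from_polarizer_images(i0, i45, i90, i135):
    """Linear Stokes components of one pixel from four polarizer angles."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # +45 deg vs. -45 deg
    return s0, s1, s2


def polarization_extrema(i0, i45, i90, i135):
    """Return (i_max, i_min, dolp, aolp) for one pixel.

    Assumption: i_max and i_min are the largest and smallest intensities
    observable while rotating the polarizer, under the Stokes model above.
    """
    s0, s1, s2 = stokes_from_polarizer_images(i0, i45, i90, i135)
    amplitude = math.hypot(s1, s2)            # polarized part of the signal
    i_max = 0.5 * (s0 + amplitude)
    i_min = 0.5 * (s0 - amplitude)
    dolp = amplitude / s0 if s0 > 0 else 0.0  # degree of linear polarization
    aolp = 0.5 * math.atan2(s2, s1)           # angle of linear polarization
    return i_max, i_min, dolp, aolp


# Synthetic pixel: S0 = 2, DoLP = 0.5, polarization angle = 30 degrees.
true_aolp = math.radians(30.0)
s0, s1, s2 = 2.0, math.cos(2 * true_aolp), math.sin(2 * true_aolp)
pixel = [polarizer_intensity(s0, s1, s2, math.radians(a)) for a in (0, 45, 90, 135)]
i_max, i_min, dolp, aolp = polarization_extrema(*pixel)
```

Under this model $I_{\text{max}} + I_{\text{min}} = S_0$, so the pair separates the polarized amplitude from the unpolarized floor of the signal. On full images the same arithmetic applies elementwise.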

Abstract

In this work, we propose a novel learning-based method to jointly estimate the shape and subsurface scattering (SSS) parameters of translucent objects by utilizing polarization cues. Although polarization cues have been used in various applications, such as shape from polarization (SfP), BRDF estimation, and reflection removal, their application in SSS estimation has not yet been explored. Our observations indicate that the SSS affects not only the light intensity but also the polarization signal. Hence, the polarization signal can provide additional cues for SSS estimation. We also introduce the first large-scale synthetic dataset of polarized translucent objects for training our model. Our method outperforms several baselines from the SfP and inverse rendering realms on both synthetic and real data, as demonstrated by qualitative and quantitative results.

Paper Structure

This paper contains 15 sections, 15 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Visual results on real-world translucent objects. Our method takes four linearly polarized images and a binary mask as input. We use Mitsuba [nimier2019mitsuba] to render a sphere to visualize the estimated SSS parameters. For a better understanding of the quality of the estimated shape (normal and depth), we provide the shading and relighting results.
  • Figure 2: (a) SSS influences the polarization of objects. Images of a bottle (BSDF) were captured under the same illumination with three different liquids (SSS). For black coffee, most photons are absorbed upon entering the object, and surface reflection dominates. For milk, photons are back-scattered, and SSS contributes significantly to the overall appearance. The latte falls between these two extremes. (b) Our scene representation. We assume unpolarized light sources. The captured light intensity consists of two components. One comes from single-bounce surface reflection, which exhibits specular polarization (orange path). The other is the refracted light (blue path): unpolarized light refracts into the object and becomes partially polarized; after undergoing multi-bounce SSS, it becomes unpolarized again; finally, it leaves the object, undergoes Fresnel refraction once more, and exhibits diffuse polarization.
  • Figure 3: Overview of the proposed model. It takes four linearly polarized images ($I_0, I_{45}, I_{90}, I_{135}$), a max and min physical prior ($I_\text{max}, I_\text{min}$), and a binary mask ($M$) as input. We use a stage-wise network architecture: the model first estimates shape and illumination, then uses these estimates to guide the SSS parameter estimation. A novel reconstruction network, whose inputs are four pure BSDF images ($I_{b0}, I_{b45}, I_{b90}, I_{b135}$) and the estimated SSS parameters, is proposed to further reduce the ambiguity of SSS estimation. Note that the reconstruction network is used only during training.
  • Figure 4: Visual comparison of estimated normals with Deep SfP [ba2020deep] and SfPW [lei2022shape]. Mean angular errors are provided in the top-left corners.
  • Figure 5: Visual comparison with an inverse scattering method. The first row shows the GT intensity images. The second row shows images rendered with the GT shape and illumination and the SSS parameters estimated by Che et al. [che2020towards]. The third row shows images rendered with the SSS parameters estimated by our method. Error maps are provided in the upper-right corners.
  • ...and 1 more figure
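
The mean angular error reported in the Figure 4 caption is a standard metric for surface normals. The sketch below is a minimal version of it, assuming normals are given as flattened per-pixel 3-vectors and that only pixels inside the object mask are scored. The paper's exact evaluation protocol is not given in this summary, and the function names are hypothetical.

```python
import math


def angular_error_deg(n_est, n_gt):
    """Angle in degrees between two 3-vectors (need not be unit length)."""
    dot = sum(a * b for a, b in zip(n_est, n_gt))
    norm = math.sqrt(sum(a * a for a in n_est)) * math.sqrt(sum(b * b for b in n_gt))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(normals_est, normals_gt, mask):
    """Average angular error over pixels where mask is truthy.

    Assumption: the three inputs are flattened, equally long per-pixel lists.
    """
    errors = [
        angular_error_deg(est, gt)
        for est, gt, inside in zip(normals_est, normals_gt, mask)
        if inside
    ]
    if not errors:
        raise ValueError("mask selects no pixels")
    return sum(errors) / len(errors)


# Three pixels; the last one lies outside the mask and is ignored.
estimated = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
ground_truth = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
object_mask = [1, 1, 0]
mae = mean_angular_error_deg(estimated, ground_truth, object_mask)
```

Clamping the cosine keeps `math.acos` inside its domain when rounding pushes a dot product of near-parallel vectors slightly past 1.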