SS-SfP: Neural Inverse Rendering for Self Supervised Shape from (Mixed) Polarization

Ashish Tiwari, Shanmuganathan Raman

TL;DR

This work tackles Shape from Polarization (SfP) under mixed polarization in a self-supervised setting. It introduces SS-SfP, a neural inverse rendering framework that first decomposes a single-view polarization image into diffuse and specular reflectance cues and estimates a per-pixel refractive index via nonlinear least squares. A dual-branch encoder–decoder network then predicts surface normals and depth guided by the reflectance cues and view encoding, reconstructing the input polarization measurements to enforce self-supervision. The approach achieves competitive, often state-of-the-art performance on DeepSfP and SPW datasets in both supervised and self-supervised modes, and extends SfP to in-the-wild outdoor scenes without ground-truth normals. This work advances polarization-based 3D reconstruction by handling mixed reflections and eliminating the need for ground-truth normals during training, enabling practical SfP in diverse real-world scenarios.
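As a rough illustration of the "polarization measurements" that the network reconstructs, the sketch below computes the degree of polarization (DoP, $\rho$) and angle of polarization (AoP, $\phi$) of one pixel from intensities captured behind a linear polarizer at 0°, 45°, 90° and 135°, then re-renders the intensity from them. It uses the standard linear Stokes relations and the textbook sinusoidal model $I(\theta) = \tfrac{S_0}{2}\,(1 + \rho\cos(2\theta - 2\phi))$, not the paper's modified mixed-polarization model, which is not given in this summary. The four-angle capture and all function names are assumptions made for illustration.

```python
import math


def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters of one pixel from four polarizer angles.

    Assumption: the camera captures 0/45/90/135 degree polarizer images.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # +45 vs. -45 degrees
    return s0, s1, s2


def dop_aop(i0, i45, i90, i135):
    """Return (DoP, AoP); AoP is in radians, wrapped to [0, pi)."""
    s0, s1, s2 = stokes_from_intensities(i0, i45, i90, i135)
    if s0 <= 0.0:
        return 0.0, 0.0  # dark pixel: polarization state is undefined
    dop = math.hypot(s1, s2) / s0
    aop = (0.5 * math.atan2(s2, s1)) % math.pi
    return dop, aop


def render_intensity(s0, dop, aop, polarizer_angle):
    """Textbook forward model: intensity seen through a linear polarizer."""
    return 0.5 * s0 * (1.0 + dop * math.cos(2.0 * polarizer_angle - 2.0 * aop))


if __name__ == "__main__":
    # Round trip: render four measurements, then recover DoP and AoP.
    angles = [0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4]
    measurements = [render_intensity(2.0, 0.3, 0.4, a) for a in angles]
    print(dop_aop(*measurements))
```

A self-supervised loss in this spirit would compare the captured intensities with intensities re-rendered from the predicted shape; how SS-SfP defines that loss is not stated here.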

Abstract

We present a novel inverse rendering-based framework to estimate the 3D shape (per-pixel surface normals and depth) of objects and scenes from single-view polarization images, a problem popularly known as Shape from Polarization (SfP). Existing physics-based and learning-based SfP methods operate under certain restrictions: (a) purely diffuse or purely specular reflections, which are seldom observed on real surfaces; (b) availability of ground-truth surface normals for direct supervision, which are hard to acquire and limited by the scanner's resolution; and (c) a known refractive index. To overcome these restrictions, we first learn to separate the partially polarized diffuse and specular reflection components, which we call reflectance cues, based on a modified polarization reflection model. We then estimate shape under mixed polarization through an inverse rendering-based, self-supervised deep learning framework called SS-SfP, guided by the polarization data and the estimated reflectance cues. Furthermore, we obtain the refractive index as the solution of a non-linear least squares problem. Through extensive quantitative and qualitative evaluation, we establish the efficacy of the proposed framework on simple single-object scenes from the DeepSfP dataset and complex in-the-wild scenes from the SPW dataset in an entirely self-supervised setting. To the best of our knowledge, this is the first learning-based approach to address SfP under mixed polarization in a completely self-supervised framework.
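To make the refractive-index step more concrete, here is a minimal sketch of a one-parameter non-linear least squares fit. It assumes the widely used diffuse DoP model, $\rho_d(n, \theta) = \dfrac{(n - 1/n)^2 \sin^2\theta}{2 + 2n^2 - (n + 1/n)^2 \sin^2\theta + 4\cos\theta\sqrt{n^2 - \sin^2\theta}}$, with known zenith angles $\theta$, and solves for $n$ by Gauss-Newton iteration. The paper's actual objective, which involves its modified reflection model and both diffuse and specular cues, is not given in this summary, so the model, the solver, and the names `diffuse_dop` and `fit_refractive_index` are illustrative assumptions only.

```python
import math


def diffuse_dop(n, theta):
    """Standard diffuse degree of polarization for refractive index n and
    zenith angle theta (radians). Assumed model, not the paper's."""
    s2 = math.sin(theta) ** 2
    num = (n - 1.0 / n) ** 2 * s2
    den = (2.0 + 2.0 * n * n - (n + 1.0 / n) ** 2 * s2
           + 4.0 * math.cos(theta) * math.sqrt(n * n - s2))
    return num / den


def fit_refractive_index(thetas, rhos, n0=1.5, iters=50, n_min=1.01, n_max=3.0):
    """Gauss-Newton for min_n sum_i (diffuse_dop(n, theta_i) - rho_i)^2."""
    n, h = n0, 1e-6
    for _ in range(iters):
        # Residuals and a central-difference Jacobian with respect to n.
        r = [diffuse_dop(n, t) - p for t, p in zip(thetas, rhos)]
        j = [(diffuse_dop(n + h, t) - diffuse_dop(n - h, t)) / (2.0 * h)
             for t in thetas]
        jtj = sum(x * x for x in j)
        if jtj < 1e-18:
            break  # flat objective, e.g. all zenith angles near zero
        step = -sum(a * b for a, b in zip(j, r)) / jtj
        n = min(max(n + step, n_min), n_max)  # keep n physically plausible
        if abs(step) < 1e-10:
            break
    return n


if __name__ == "__main__":
    # Synthetic check: generate DoP values for n = 1.6 and recover n.
    thetas = [0.3, 0.6, 0.9, 1.2]
    rhos = [diffuse_dop(1.6, t) for t in thetas]
    print(fit_refractive_index(thetas, rhos, n0=1.3))
```

With noisy measurements, a library solver with robust losses and bounds would likely be a better choice than this hand-rolled iteration.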

Paper Structure (16 sections, 31 equations, 11 figures, 6 tables)

Figures (11)

  • Figure 1: Layer-wise detailed description of the proposed framework - SS-SfP.
  • Figure 2: Qualitative results of reflectance separation $(A_{d}, A_{s})$ and recovered polarization information: Angle of Polarization ($\phi_{d}$) and Degree of Polarization ($\rho_{d}, \rho_{s}$).
  • Figure 3: Qualitative results on surface normal estimation over a few objects from the DeepSfP dataset [ba2020deep].
  • Figure 4: Qualitative results on surface normal estimation on a few scenes from the SPW dataset [lei2022shape].
  • Figure 5: Qualitative results on far-field outdoor scenes. Since ground truth normals are unavailable, we validate the efficacy of estimated normals through the recovered AoP ($\widehat{\phi}$) and DoP ($\widehat{\rho}$).
  • ...and 6 more figures