Physics-informed Active Polarimetric 3D Imaging for Specular Surfaces

Jiazhang Wang, Hyelim Yang, Tianyi Wang, Florian Willomitzer

TL;DR

This paper tackles the challenge of fast, accurate single-shot 3D imaging of specular surfaces by introducing a physics-informed deep learning framework that blends polarization priors with geometry. A dual-encoder network with FiLM-based cross-modal modulation fuses Stokes/DoLP-derived cues with a coarse camera–screen correspondence to robustly estimate surface normals from a single shot. Key contributions include a two-stage architecture that (i) derives coarse depth/normals from polarimetric inputs and (ii) adaptively weights geometric information via FiLM to mitigate error propagation, achieving $0.79^\circ$ mean angular error and $8\,\mathrm{ms}$ inference on unseen objects, significantly outperforming conventional polarimetric methods. The approach enables practical, deployment-ready 3D imaging of complex specular surfaces in dynamic environments, demonstrated with a real prototype and synthetic Mitsuba3-based training data, while outlining future work on broader materials and sensor-level modeling.
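
To make the Stokes/DoLP cues mentioned above concrete, here is a minimal sketch of how they are commonly computed from four polarizer-angle intensity images. It assumes a polarization camera that delivers images behind 0°, 45°, 90° and 135° linear polarizers, which this summary does not specify; the function names `stokes_from_four_angles` and `dolp_aolp` are hypothetical and not taken from the paper.

```python
import numpy as np

def stokes_from_four_angles(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities behind 0/45/90/135 degree polarizers.

    Assumption: ideal polarizers and co-registered images of identical shape.
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # +45 vs. -45 degree component
    return s0, s1, s2

def dolp_aolp(s0, s1, s2, eps=1e-8):
    """Degree and angle of linear polarization (angle in radians)."""
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, eps)  # eps guards dark pixels
    aolp = 0.5 * np.arctan2(s2, s1)
    return dolp, aolp

# Toy input: fully horizontally polarized light of unit intensity.
ones = np.ones((2, 2))
s0, s1, s2 = stokes_from_four_angles(ones, 0.5 * ones, 0.0 * ones, 0.5 * ones)
dolp, aolp = dolp_aolp(s0, s1, s2)
```

For this toy input every pixel has a DoLP of 1 and an angle of 0, while four equal intensities (unpolarized light) give a DoLP of 0.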

Abstract

3D imaging of specular surfaces remains challenging in real-world scenarios such as in-line inspection or hand-held scanning, which require fast and accurate measurement of complex geometries. Optical metrology techniques such as deflectometry achieve high accuracy but typically rely on multi-shot acquisition, making them unsuitable for dynamic environments. Fourier-based single-shot approaches alleviate this constraint, yet their performance deteriorates on surfaces with high-spatial-frequency structure or large curvature. Alternatively, polarimetric 3D imaging in computer vision operates in a single-shot fashion and is robust to geometric complexity. However, its accuracy is fundamentally limited by the orthographic imaging assumption. In this paper, we propose a physics-informed deep learning framework for single-shot 3D imaging of complex specular surfaces. Polarization cues provide orientation priors that assist in interpreting geometric information encoded by structured illumination. These complementary cues are processed through a dual-encoder architecture with mutual feature modulation, allowing the network to resolve their nonlinear coupling and directly infer surface normals. The proposed method achieves accurate and robust normal estimation from a single shot with fast inference, enabling practical 3D imaging of complex specular surfaces.
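
The feature modulation between the two encoders can be illustrated with a small sketch of Feature-wise Linear Modulation (FiLM), in which one branch predicts a per-channel scale and shift applied to the other branch's features. This is not the authors' network: the global average pooling, the single linear layer, the channel counts and the names `film_params` and `film` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def film_params(cond, w_gamma, b_gamma, w_beta, b_beta):
    """Predict per-channel scale (gamma) and shift (beta) from conditioning features.

    cond: (C_cond, H, W) feature map of the conditioning branch.
    Assumption: parameters come from globally pooled features via one linear layer.
    """
    pooled = cond.mean(axis=(1, 2))        # (C_cond,)
    gamma = w_gamma @ pooled + b_gamma     # (C_target,)
    beta = w_beta @ pooled + b_beta        # (C_target,)
    return gamma, beta

def film(x, gamma, beta):
    """Apply FiLM: scale and shift every channel of x, which has shape (C, H, W)."""
    return gamma[:, None, None] * x + beta[:, None, None]

rng = np.random.default_rng(0)
pol_feat = rng.standard_normal((4, 8, 8))  # hypothetical polarimetric features
geo_feat = rng.standard_normal((6, 8, 8))  # hypothetical correspondence features

# Hypothetical, untrained linear layers mapping 4 pooled channels to 6 targets.
w_g = 0.1 * rng.standard_normal((6, 4))
w_b = 0.1 * rng.standard_normal((6, 4))

# Polarimetric features modulate the geometric branch.
gamma, beta = film_params(pol_feat, w_g, np.ones(6), w_b, np.zeros(6))
geo_mod = film(geo_feat, gamma, beta)
```

Swapping the roles of `pol_feat` and `geo_feat` (with appropriately shaped weights) gives modulation in the opposite direction. With `gamma = 1` and `beta = 0` the operation reduces to the identity, so a trained network can in principle suppress an unreliable cue by driving its scale toward zero.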

Paper Structure

This paper contains 3 sections, 1 equation, 3 figures, 1 table.

Figures (3)

  • Figure 1: Overview of the proposed physics-informed learning framework. Polarimetric inputs, including Stokes parameters and DoLP, are first processed by U-Nets to obtain coarse depth and normal estimates. The coarse correspondence map is then calculated analytically. These physics priors are processed separately through two encoder branches to extract modality-specific features. Feature-wise Linear Modulation (FiLM) layers then adaptively fuse the polarimetric cues and geometric correspondence features, enabling robust normal estimation.
  • Figure 2: Real experimental prototype and qualitative comparison. (a) The prototype consists of a polarization camera and an unpolarized display. (b) Surface normals estimated by the proposed method from a single-shot capture. (c) Surface normals estimated by the previous physics-based method [wang20253d] from multi-shot sequential captures. The proposed method produces a more consistent normal field, especially in the facial region of the horse.
  • Figure 3: Quantitative evaluation. (a) Sample input image under cross-sinusoidal illumination. (b) Surface normal map estimated by the proposed method. (c) Angular error map of the conventional polarimetric 3D imaging method [atkinson2006recovery]. (d) Angular error map of the proposed method.
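
The angular error maps in Figure 3 and the mean angular error quoted in the TL;DR are per-pixel angles between estimated and ground-truth normals. The sketch below shows one common way to compute them; the function names, the `(H, W, 3)` array layout and the optional validity mask are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np

def angular_error_deg(n_pred, n_gt, eps=1e-12):
    """Per-pixel angle in degrees between two normal maps of shape (H, W, 3)."""
    # Normalize both maps so the dot product equals the cosine of the angle.
    n_pred = n_pred / np.maximum(np.linalg.norm(n_pred, axis=-1, keepdims=True), eps)
    n_gt = n_gt / np.maximum(np.linalg.norm(n_gt, axis=-1, keepdims=True), eps)
    # Clip to guard against rounding slightly outside [-1, 1].
    cos = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

def mean_angular_error_deg(n_pred, n_gt, mask=None):
    """Mean angular error, optionally restricted to a boolean object mask (H, W)."""
    err = angular_error_deg(n_pred, n_gt)
    return float(err.mean() if mask is None else err[mask].mean())

# Toy check: predicted normals tilted by 10 degrees about the x-axis.
n_gt = np.zeros((4, 4, 3))
n_gt[..., 2] = 1.0
tilt = np.radians(10.0)
n_pred = np.zeros((4, 4, 3))
n_pred[..., 1] = np.sin(tilt)
n_pred[..., 2] = np.cos(tilt)

error_map = angular_error_deg(n_pred, n_gt)
mae = mean_angular_error_deg(n_pred, n_gt)
```

Here every entry of `error_map` is approximately 10 degrees, and so is `mae`. In practice the mask would exclude background pixels so that they do not bias the mean.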