NEMTO: Neural Environment Matting for Novel View and Relighting Synthesis of Transparent Objects

Dongqing Wang, Tong Zhang, Sabine Süsstrunk

TL;DR

NEMTO tackles the ill-posed problem of rendering transparent objects with unknown indices of refraction (IOR) by combining an implicit signed distance function (SDF) geometry representation with a neural Ray Bending Network that learns refraction directly from the scene. The method jointly optimizes geometry and appearance under natural illumination, using a differentiable forward renderer that evaluates an environment map, together with a set of losses that disentangle surface shape from refraction. Key contributions include the first end-to-end pipeline for novel-view synthesis and relighting of transparent objects with unknown IOR, and a neural environment-matting approach that reproduces high-frequency refraction effects on synthetic and real data. This makes it possible to render transparent objects realistically in VR/AR settings without controlled lighting or known material indices.

Abstract

We propose NEMTO, the first end-to-end neural rendering pipeline to model 3D transparent objects with complex geometry and unknown indices of refraction. Commonly used appearance modeling such as the Disney BSDF model cannot accurately address this challenging problem due to the complex light paths bending through refractions and the strong dependency of surface appearance on illumination. With 2D images of the transparent object as input, our method is capable of high-quality novel view and relighting synthesis. We leverage implicit Signed Distance Functions (SDF) to model the object geometry and propose a refraction-aware ray bending network to model the effects of light refraction within the object. Our ray bending network is more tolerant to geometric inaccuracies than traditional physically-based methods for rendering transparent objects. We provide extensive evaluations on both synthetic and real-world datasets to demonstrate our high-quality synthesis and the applicability of our method.
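
To make the geometry side of the abstract concrete, the following is a minimal sketch of how a ray can be intersected with an SDF by sphere tracing. It is an illustration under stated assumptions, not the authors' implementation: an analytic sphere SDF (`sphere_sdf`) stands in for the learned geometry network, and the names `sphere_trace` and `sdf_normal`, along with the step count and tolerances, are hypothetical choices.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from p to a sphere: negative inside, positive outside.

    Assumption: this analytic SDF is a stand-in for a learned geometry network.
    """
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, center))) - radius


def sphere_trace(origin, direction, sdf, max_steps=128, eps=1e-6, t_max=100.0):
    """March along origin + t * direction (unit direction assumed).

    Each step advances by the current SDF value, which is a safe step size
    because no surface can be closer than that distance.
    Returns the hit distance t, or None if the ray misses.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < eps:
            return t
        t += dist
        if t > t_max:
            return None
    return None


def sdf_normal(p, sdf, h=1e-4):
    """Unit surface normal as the normalized central-difference SDF gradient."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += h
        lo[axis] -= h
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * h))
    norm = math.sqrt(sum(g * g for g in grad))
    return tuple(g / norm for g in grad)


# Usage: a ray starting at z = -3 and looking down +z meets the unit sphere.
t_hit = sphere_trace((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), sphere_sdf)
x_hit = (0.0, 0.0, -3.0 + t_hit)
n_hit = sdf_normal(x_hit, sphere_sdf)
```

With a learned SDF, the gradient would typically come from automatic differentiation rather than finite differences; the finite-difference version is used here only to keep the sketch dependency-free.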

Paper Structure

This paper contains 14 sections, 14 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Given as input multi-view images captured under natural illumination, NEMTO is capable of high-quality novel view synthesis and relighting through optimizing an end-to-end neural representation for a transparent object. NEMTO disentangles geometry and illumination-dependent appearance, which previous neural rendering methods, such as PhySG [zhang2021physg], cannot.
  • Figure 2: Overview of the NEMTO framework. (a) Geometry Network. For each viewing ray $\boldsymbol{\rho}(t) = \mathbf{o} + t\,\boldsymbol{\omega}_\mathbf{i}$, we query the geometry network $f_\theta$ through sphere tracing for the ray-surface intersection. (b) Ray Bending Network. We map the viewing direction $\boldsymbol{\omega}_\mathbf{i}$ directly to the final refracted ray $\boldsymbol{\omega}_\mathbf{t}$ exiting the object geometry, with surface normal $\mathbf{n}$ and intersection $\mathbf{x}$ as priors. As we use an environment map as illumination, the radiance evaluated through refraction depends only on the ray direction, not on the location from which the ray exits. (c) Forward Rendering. To render $\boldsymbol{\rho}(t)$, we analytically calculate the reflection direction $\boldsymbol{\omega}_\mathbf{r}$ from $\boldsymbol{\omega}_\mathbf{i}$ and $\mathbf{n}$. We then use our physically-inspired rendering algorithm with the predicted "refractive index" $\eta_\mathbf{t}$ and evaluate the environment map along $\boldsymbol{\omega}_\mathbf{t}$ and $\boldsymbol{\omega}_\mathbf{r}$.
  • Figure 3: Qualitative comparison with baseline methods on Novel View Synthesis. We compare our novel view synthesis on transparent objects with the methods that we identify as most relevant to ours: NeRF [mildenhall2020nerf], Eikonal Field [bemana2022eikonal], IDR [yariv2020multiview], and PhySG [zhang2021physg]. Our method outperforms the others on the high-frequency details caused by ray refraction.
  • Figure 4: Qualitative results on Relighting for synthetic datasets. We show that our network can faithfully relight the object with unseen environment illumination, unlike PhySG [zhang2021physg].
  • Figure 5: Experiments on different transparent media. We show that NEMTO works for transparent media other than glass. The learned $\eta_\mathbf{t}$ adapts to each medium and allows our model to synthesize faithful results.
  • ...and 5 more figures
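
The forward-rendering step described in the Figure 2 caption can be sketched as follows. This is a hedged illustration, not the paper's renderer. The reflection formula and the single-interface Snell refraction are standard. The Schlick-style Fresnel blend in `shade`, the y-up latitude-longitude convention in `direction_to_equirect_uv`, and all function names are assumptions of this sketch. In NEMTO the exit direction $\boldsymbol{\omega}_\mathbf{t}$ is predicted by the Ray Bending Network, so `shade` simply takes it as an input, and `refract` is included only as a reference for what a single analytic interface would give.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(w_i, n):
    """Mirror reflection of the unit incident direction w_i about unit normal n.

    w_i points toward the surface; n points away from it.
    """
    d = dot(w_i, n)
    return tuple(w - 2.0 * d * m for w, m in zip(w_i, n))


def refract(w_i, n, eta):
    """Snell refraction at ONE interface, with eta = n_outside / n_inside.

    Returns None on total internal reflection.
    """
    cos_i = -dot(w_i, n)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None
    cos_t = math.sqrt(1.0 - sin2_t)
    return tuple(eta * w + (eta * cos_i - cos_t) * m for w, m in zip(w_i, n))


def direction_to_equirect_uv(w):
    """Map a unit direction to (u, v) in a lat-long environment map.

    Assumed convention: y is up and -z maps to the image center.
    Only the direction matters, not the point where the ray leaves the object.
    """
    x, y, z = w
    u = math.atan2(x, -z) / (2.0 * math.pi) + 0.5
    v = math.acos(max(-1.0, min(1.0, y))) / math.pi
    return u, v


def schlick_fresnel(cos_i, ior):
    """Schlick approximation of the reflected fraction (assumed blend weight)."""
    r0 = ((1.0 - ior) / (1.0 + ior)) ** 2
    return r0 + (1.0 - r0) * (1.0 - cos_i) ** 5


def shade(w_i, n, w_t, ior, env):
    """Blend environment radiance along reflection and refraction directions.

    env is a hypothetical callable: unit direction -> (r, g, b).
    w_t is the exit direction (network-predicted in NEMTO); None means
    all energy goes to the reflection.
    """
    w_r = reflect(w_i, n)
    if w_t is None:
        return env(w_r)
    f = schlick_fresnel(-dot(w_i, n), ior)
    return tuple(f * a + (1.0 - f) * b for a, b in zip(env(w_r), env(w_t)))


# Usage: head-on view of a surface under a constant white environment.
w_i, n = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
w_t = refract(w_i, n, 1.0 / 1.5)
color = shade(w_i, n, w_t, 1.5, lambda w: (1.0, 1.0, 1.0))
```

Because the lookup depends only on direction, a relighting experiment in this sketch amounts to swapping the `env` callable while keeping the geometry and the predicted directions fixed.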