TensoFlow: Tensorial Flow-based Sampler for Inverse Rendering

Chun Gu, Xiaofei Wei, Li Zhang, Xiatian Zhu

TL;DR

TensoFlow tackles inverse rendering by learning a spatially and directionally aware importance sampler powered by tensorial normalizing flows. The model jointly estimates geometry, material, and lighting, with a two-stage pipeline and a tensorial scene representation that conditions the flow-based sampler on position and reflected direction. By minimizing the cross-entropy between the integrand and the sampler distribution, it achieves substantially lower Monte Carlo variance and improved relighting quality on synthetic and real datasets, outperforming fixed-sampler baselines and prior learned-sampler methods. While effective, the approach incurs additional computational cost due to flow-based sampling. Overall, it offers a flexible, generalizable framework for variance-reduced inverse rendering with practical impact on relighting tasks and scene understanding.
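
For readers who want the objective in symbols, the following is a sketch of the standard formulation this summary refers to, not the paper's exact equations. The names $g$, $p$, and $N$ are introduced here for illustration; $q$ and $\mathcal{L}_\mathrm{ce}$ follow the notation of the Figure 1 caption. It assumes a non-negative scalar integrand.

$$
\begin{aligned}
L_o(\boldsymbol{x}, \boldsymbol{\omega}_o) &= \int_{\Omega} g(\boldsymbol{\omega}_i)\, \mathrm{d}\boldsymbol{\omega}_i
\;\approx\; \frac{1}{N} \sum_{k=1}^{N} \frac{g(\boldsymbol{\omega}_k)}{q(\boldsymbol{\omega}_k)}, \qquad \boldsymbol{\omega}_k \sim q, \\
p(\boldsymbol{\omega}_i) &= \frac{g(\boldsymbol{\omega}_i)}{\int_{\Omega} g(\boldsymbol{\omega})\, \mathrm{d}\boldsymbol{\omega}}, \qquad
\mathcal{L}_\mathrm{ce} = -\int_{\Omega} p(\boldsymbol{\omega}_i) \log q(\boldsymbol{\omega}_i)\, \mathrm{d}\boldsymbol{\omega}_i .
\end{aligned}
$$

Because $\mathcal{L}_\mathrm{ce}$ equals $\mathrm{KL}(p \,\|\, q)$ plus the entropy of $p$, which does not depend on $q$, minimizing it drives $q$ toward $p$. When $q = p$, every term $g/q$ in the estimator equals the integral itself, so the estimator has zero variance.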

Abstract

Inverse rendering aims to recover scene geometry, material properties, and lighting from multi-view images. Given the complexity of light-surface interactions, importance sampling is essential for evaluating the rendering equation, as it reduces variance and improves the efficiency of Monte Carlo sampling. Existing inverse rendering methods typically rely on manually pre-defined, non-learnable importance samplers, which struggle to match the spatially and directionally varying integrand, resulting in high variance and suboptimal performance. To address this limitation, we propose learning a spatially and directionally aware importance sampler for the rendering equation, so that it can accurately and flexibly capture the unconstrained complexity of a typical scene. We further formulate TensoFlow, a generic approach for sampler learning in inverse rendering that closely matches the integrand of the rendering equation both spatially and directionally. Concretely, our sampler is parameterized by normalizing flows, allowing both directional sampling of incident light and probability density function (PDF) inference. To capture the spatial characteristics of the sampler, we learn a tensorial representation of the scene space, which provides spatial conditioning and, together with the reflected direction, yields spatially and directionally aware sampling distributions. Our model is optimized by minimizing the difference between the integrand and the distribution defined by our normalizing flow. Extensive experiments validate the superiority of TensoFlow over prior alternatives on both synthetic and real-world benchmarks.
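
To make the variance argument concrete, here is a minimal NumPy sketch on a toy 1D problem. It is not taken from the paper: the integrand, the Gaussian proposal, and all function names are hypothetical stand-ins chosen so that the effect of a well-matched sampler is easy to see.

```python
import numpy as np

def integrand(x):
    # Hypothetical toy integrand: a narrow lobe near x = 0.8,
    # standing in for a glossy reflection lobe.
    return np.exp(-0.5 * ((x - 0.8) / 0.05) ** 2)

def estimate_uniform(n, rng):
    # Uniform sampling on [0, 1]; the pdf is 1, so no division is needed.
    x = rng.random(n)
    return integrand(x).mean()

def estimate_importance(n, rng, mu=0.8, sigma=0.07):
    # Sample from a Gaussian roughly matched to the lobe and divide by its pdf.
    x = rng.normal(mu, sigma, n)
    pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    # The integration domain is [0, 1]; samples outside contribute zero.
    inside = (x >= 0.0) & (x <= 1.0)
    return (np.where(inside, integrand(x), 0.0) / pdf).mean()

def compare_variance(trials=200, n=256, seed=0):
    # Repeat both estimators many times and compare their spread.
    rng = np.random.default_rng(seed)
    uni = np.array([estimate_uniform(n, rng) for _ in range(trials)])
    imp = np.array([estimate_importance(n, rng) for _ in range(trials)])
    return uni.mean(), uni.var(), imp.mean(), imp.var()

if __name__ == "__main__":
    mean_u, var_u, mean_i, var_i = compare_variance()
    # Both means should be close to 0.05 * sqrt(2 * pi), roughly 0.1253.
    print(f"uniform:    mean={mean_u:.4f} var={var_u:.2e}")
    print(f"importance: mean={mean_i:.4f} var={var_i:.2e}")
```

Both estimators target the same integral, but the matched proposal places samples where the integrand is large, so its estimates scatter far less. A fixed proposal like this one only helps while the lobe stays where it was tuned, which is the limitation of pre-defined samplers that the abstract describes.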

Paper Structure

This paper contains 20 sections, 21 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Left: Material and lighting estimation stage of TensoFlow. Given a ray-surface intersection point $\boldsymbol{x}$, we use tensorial encoders to encode latent features for both the material properties and the importance sampler. When evaluating the rendering equation, incident directions are sampled from our learnable importance sampler, which is a frozen copy of the training normalizing flow. The normalizing flow is optimized by minimizing the distribution difference $\mathcal{L}_\mathrm{ce}$ between $q(\boldsymbol{\omega}_i)$ and the integrand. The material properties, parameterized by $\theta_m$, are optimized by minimizing the RGB rendering loss $\mathcal{L}_\mathrm{c}$ of the final rendering integral. Right: Tensorial normalizing flow. With spatial prior $V_f$ and directional prior $\boldsymbol{\omega}_r$, our tensorial normalizing flow, implemented using piecewise-quadratic coupling layers, enables both incident direction sampling and PDF querying.
  • Figure 2: Illustration of a piecewise-quadratic coupling layer, incorporating tensorial latent feature $V_f$ and reflected direction $\boldsymbol{\omega}_r$ as spatial and directional priors.
  • Figure 3: Qualitative comparison of relighting quality on the TensoSDF dataset [li2024tensosdf].
  • Figure 4: Visualization of material decomposition, normal map, and per-pixel variance in rendering equation evaluation, comparing results with a pre-defined sampler and our proposed tensorial normalizing flow. The per-pixel variance results are scaled by $1000$ for clearer visualization.
  • Figure 5: Qualitative comparison of relit images with TensoSDF [li2024tensosdf] on the Stanford-ORB dataset [kuang2024stanford].
  • ...and 2 more figures
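
The Figure 1 caption describes a sampler that supports both drawing incident directions and querying their PDF, trained by minimizing $\mathcal{L}_\mathrm{ce}$ against the integrand. The sketch below illustrates that idea in a deliberately simplified form and is not the paper's implementation: it uses a single unconditional 1D piecewise-linear warp (a simpler relative of the piecewise-quadratic coupling layers named in the caption), with no tensorial feature $V_f$ or reflected direction $\boldsymbol{\omega}_r$ as conditioning. The class name, the toy integrand, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np

def integrand(x):
    # Hypothetical toy integrand: a narrow lobe near x = 0.8.
    return np.exp(-0.5 * ((x - 0.8) / 0.05) ** 2)

class PiecewiseLinearSampler:
    """1D sampler on [0, 1]: equal-width bins with a piecewise-constant pdf."""

    def __init__(self, num_bins=16):
        self.logits = np.zeros(num_bins)  # learnable parameters

    def bin_probs(self):
        # Softmax keeps every bin probability positive and summing to one.
        z = np.exp(self.logits - self.logits.max())
        return z / z.sum()

    def pdf(self, x):
        # PDF query: probability mass of the bin divided by its width (1 / k).
        k = len(self.logits)
        idx = np.minimum((np.asarray(x) * k).astype(int), k - 1)
        return self.bin_probs()[idx] * k

    def sample(self, u):
        # Sampling: push uniform u through the inverse of the piecewise-linear CDF.
        p = self.bin_probs()
        cdf = np.concatenate([[0.0], np.cumsum(p)])
        idx = np.clip(np.searchsorted(cdf, u, side="right") - 1, 0, len(p) - 1)
        return (idx + (u - cdf[idx]) / p[idx]) / len(p)

    def fit_step(self, x, weights, lr=1.0):
        # One gradient step on the cross-entropy -sum_j w_j log q(x_j),
        # where w_j are normalized integrand weights of the proposals x_j.
        k = len(self.logits)
        idx = np.minimum((x * k).astype(int), k - 1)
        target = np.bincount(idx, weights=weights / weights.sum(), minlength=k)
        # Gradient of the loss w.r.t. the logits is softmax(logits) - target.
        self.logits -= lr * (self.bin_probs() - target)

def train_and_compare(steps=300, batch=1024, trials=200, n=256, seed=0):
    rng = np.random.default_rng(seed)
    sampler = PiecewiseLinearSampler(num_bins=16)
    for _ in range(steps):
        x = rng.random(batch)              # proposals from a uniform sampler
        sampler.fit_step(x, integrand(x))  # weight = integrand / uniform pdf (= 1)
    uni, learned = [], []
    for _ in range(trials):
        uni.append(integrand(rng.random(n)).mean())
        xl = sampler.sample(rng.random(n))
        learned.append((integrand(xl) / sampler.pdf(xl)).mean())
    return np.var(uni), np.var(learned), sampler

if __name__ == "__main__":
    var_u, var_l, _ = train_and_compare()
    print(f"variance with uniform sampler: {var_u:.2e}")
    print(f"variance with learned sampler: {var_l:.2e}")
```

The design point carried over from the caption is that one invertible mapping serves both roles: `sample` warps uniform noise into the target domain, and `pdf` evaluates the density of that same warp, which is what the Monte Carlo estimator divides by. A conditional version would predict the bin logits from spatial and directional inputs instead of storing them as free parameters.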