Neural Scene Baking for Permutation Invariant Transparency Rendering with Real-time Global Illumination

Ziyang Zhang, Edgar Simo-Serra

TL;DR

This work tackles real-time rendering of scenes with transparent objects and global illumination by introducing GlassNet, a neural renderer that separates opaque and transparent G-buffers and uses a permutation-invariant blending function to achieve order-independent transparency. GlassNet comprises a scene encoder, a permutation-invariant transparency buffer blender, a final blending network, and a rendering network, all trained end-to-end to predict indirect lighting alongside direct lighting. The approach delivers real-time performance (e.g., 256×256 at 63 FPS; 512×512 at 32 FPS) while preserving complex transparency shading and textures, and it improves memory efficiency through a symmetric, accumulation-based blending of transparency buffers. Limitations include challenges with strong refraction and participating media, with potential future work in path prediction and hybrid neural-rendering techniques to broaden applicability and fidelity.
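
To illustrate what a symmetric, accumulation-based blend can look like, here is a minimal sketch in plain Python. It is not GlassNet's actual blending function: the per-layer encoder (`encode_layer`), the fixed `WEIGHTS`, and the five-value layer layout are all assumptions made for illustration. The sketch only shows the general idea that encoding each transparent layer independently and summing the results gives an output that does not depend on layer order, while keeping a single accumulator in memory.

```python
import itertools
import math


def encode_layer(layer, weights):
    """Map one transparency-buffer sample (a list of floats) to features.

    Hypothetical stand-in for a learned per-layer network:
    one linear map followed by tanh.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, layer))) for row in weights]


def blend_layers(layers, weights, feature_dim):
    """Accumulate encoded layers with a running sum.

    Addition is commutative, so the result does not depend on layer order
    (up to floating-point rounding), and only one accumulator is stored no
    matter how many transparent layers there are.
    """
    accumulator = [0.0] * feature_dim
    for layer in layers:
        encoded = encode_layer(layer, weights)
        accumulator = [a + e for a, e in zip(accumulator, encoded)]
    return accumulator


# Made-up per-pixel data: three transparent layers, each described by
# (depth, opacity, red, green, blue). This layout is an assumption.
LAYERS = [
    [0.2, 0.5, 0.9, 0.1, 0.1],
    [0.6, 0.3, 0.1, 0.8, 0.2],
    [0.4, 0.7, 0.2, 0.2, 0.9],
]

# Arbitrary fixed weights standing in for trained parameters (2 features).
WEIGHTS = [
    [0.5, -0.3, 0.8, 0.1, -0.6],
    [-0.2, 0.7, 0.4, -0.9, 0.3],
]

# Every ordering of the layers yields the same blended features.
reference = blend_layers(LAYERS, WEIGHTS, feature_dim=2)
for ordering in itertools.permutations(LAYERS):
    blended = blend_layers(list(ordering), WEIGHTS, feature_dim=2)
    assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(blended, reference))
```

Because the layers never need to be sorted by depth before blending, a design of this kind avoids the per-pixel sorting step that classical alpha compositing requires.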

Abstract

Neural rendering provides a fundamentally new way to render photorealistic images. Similar to traditional light-baking methods, neural rendering utilizes neural networks to bake representations of scenes, materials, and lights into latent vectors learned from path-tracing ground truths. However, existing neural rendering algorithms typically use G-buffers to provide position, normal, and texture information of scenes, which are prone to occlusion by transparent surfaces, leading to distortions and loss of detail in the rendered images. To address this limitation, we propose a novel neural rendering pipeline that accurately renders the scene behind transparent surfaces with global illumination and variable scenes. Our method separates the G-buffers of opaque and transparent objects, retaining G-buffer information behind transparent objects. Additionally, to render the transparent objects with permutation invariance, we designed a new permutation-invariant neural blending function. We integrate our algorithm into an efficient custom renderer to achieve real-time performance. Our results show that our method is capable of rendering photorealistic images with variable scenes and viewpoints, accurately capturing complex transparent structures along with global illumination. Our renderer can achieve real-time performance ($256\times 256$ at 63 FPS and $512\times 512$ at 32 FPS) on scenes with multiple variable transparent objects.
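
The occlusion problem and the separated-buffer idea can be sketched for a single pixel as follows. This is an illustrative toy, not the paper's renderer: the fragment dictionaries, the field names (`depth`, `transparent`, `label`), and the rule of discarding transparent fragments that lie behind the nearest opaque surface are assumptions.

```python
def single_gbuffer(fragments):
    """Naive approach: keep only the nearest fragment of any kind."""
    return min(fragments, key=lambda f: f["depth"]) if fragments else None


def split_fragments(fragments):
    """Split one pixel's fragments into an opaque entry and transparent layers.

    Each fragment is a dict with 'depth' (smaller is closer to the camera),
    'transparent' (bool), and 'label'. Returns a pair:
    (nearest opaque fragment or None, transparent fragments in front of it).
    """
    opaque = [f for f in fragments if not f["transparent"]]
    nearest_opaque = min(opaque, key=lambda f: f["depth"]) if opaque else None
    limit = nearest_opaque["depth"] if nearest_opaque else float("inf")
    visible_transparent = [
        f for f in fragments if f["transparent"] and f["depth"] < limit
    ]
    return nearest_opaque, visible_transparent


# Hypothetical pixel: a glass pane in front of a wall, and a second pane
# hidden behind the wall.
PIXEL = [
    {"depth": 1.0, "transparent": True, "label": "front glass"},
    {"depth": 3.0, "transparent": False, "label": "wall"},
    {"depth": 5.0, "transparent": True, "label": "hidden glass"},
]

naive = single_gbuffer(PIXEL)
opaque_entry, transparent_layers = split_fragments(PIXEL)

# The naive buffer stores the glass and loses the wall behind it.
assert naive["label"] == "front glass"
# The split keeps the wall and, separately, the glass in front of it.
assert opaque_entry["label"] == "wall"
assert [f["label"] for f in transparent_layers] == ["front glass"]
```

With a single G-buffer the wall's position, normal, and texture are overwritten by the glass, which is the loss of detail the abstract describes. Keeping the two sets apart preserves both.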

Paper Structure (20 sections, 5 equations, 13 figures, 7 tables)

Figures (13)

  • Figure 1: Overview of our neural scene rendering framework. First, our rasterization-based renderer renders the G-buffer, direct lighting, and transparency buffers. Then a neural network, which we denote as GlassNet, uses those buffers and rendering results as inputs to synthesize high-quality images with global illumination and accurate transparency. Details of GlassNet can be found in \ref{sec:NeuralRenderer} and \ref{fig:net_structure}.
  • Figure 2: Architecture of GlassNet. Our proposed method, GlassNet, contains four building blocks: the scene encoder, $\mathcal{F}$; the permutation-invariant transparency buffer blending function, $\mathcal{T}$; the final blending network, $\mathcal{B}$; and the rendering network, $\mathcal{R}$. All parts are trained jointly.
  • Figure 3: Our G-buffer scene representation approach. Instead of naively rendering the entire scene to a single set of G-buffers, we use separate buffers for the transparent objects, allowing us to represent complex transparency visibility in scenes accurately.
  • Figure 4: G-buffer, transparency buffer, and ground truth. Our method can overcome the noise in the ground truth.
  • Figure 5: Randomly selected images from datasets. The transparent areas are highlighted by the blue masks in the second row.
  • ...and 8 more figures