Real-Time Neural Appearance Models

Tizian Zeltner, Fabrice Rousselle, Andrea Weidlich, Petrik Clarberg, Jan Novák, Benedikt Bitterli, Alex Evans, Tomáš Davidovič, Simon Kallweit, Aaron Lefohn

TL;DR

This work presents a complete system for real-time rendering of scenes with complex appearance previously reserved for offline use, and shows that it is possible to inline and execute the neural decoders efficiently inside a real-time path tracer.

Abstract

We present a complete system for real-time rendering of scenes with complex appearance previously reserved for offline use. This is achieved with a combination of algorithmic and system level innovations. Our appearance model utilizes learned hierarchical textures that are interpreted using neural decoders, which produce reflectance values and importance-sampled directions. To best utilize the modeling capacity of the decoders, we equip the decoders with two graphics priors. The first prior -- transformation of directions into learned shading frames -- facilitates accurate reconstruction of mesoscale effects. The second prior -- a microfacet sampling distribution -- allows the neural decoder to perform importance sampling efficiently. The resulting appearance model supports anisotropic sampling and level-of-detail rendering, and allows baking deeply layered material graphs into a compact unified neural representation. By exposing hardware accelerated tensor operations to ray tracing shaders, we show that it is possible to inline and execute the neural decoders efficiently inside a real-time path tracer. We analyze scalability with increasing number of neural materials and propose to improve performance using code optimized for coherent and divergent execution. Our neural material shaders can be over an order of magnitude faster than non-neural layered materials. This opens up the door for using film-quality visuals in real-time applications such as games and live previews.
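The first prior named in the abstract, transforming directions into a learned shading frame before the decoder sees them, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the function names (`build_frame`, `to_local`, `decode_brdf`, `random_layers`), the six raw frame parameters (an unnormalized normal and tangent), the layer sizes, the ReLU activation, and the random, untrained weights.

```python
import math
import random


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def build_frame(n_raw, t_raw):
    """Orthonormal frame (t, b, n) from a raw normal and tangent (Gram-Schmidt).

    Assumes the raw tangent is not parallel to the raw normal.
    """
    n = normalize(n_raw)
    proj = dot(t_raw, n)
    t = normalize([tc - proj * nc for tc, nc in zip(t_raw, n)])
    b = cross(n, t)
    return t, b, n


def to_local(frame, w):
    """Express world-space direction w in the coordinates of the frame."""
    t, b, n = frame
    return [dot(w, t), dot(w, b), dot(w, n)]


def random_layers(sizes, seed=0):
    """Hypothetical stand-in for trained weights: list of (matrix, bias) pairs."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def mlp(x, layers):
    """Plain fully connected network with ReLU on all but the last layer."""
    for i, (weights, bias) in enumerate(layers):
        x = [dot(row, x) + b for row, b in zip(weights, bias)]
        if i < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def decode_brdf(z, frame_params, wi, wo, layers):
    """Rotate both directions into the learned frame, then run the MLP.

    frame_params holds 6 numbers (raw normal, raw tangent); in this sketch they
    are passed in directly instead of being predicted from the latent code z.
    """
    frame = build_frame(frame_params[0:3], frame_params[3:6])
    features = to_local(frame, wi) + to_local(frame, wo) + list(z)
    return mlp(features, layers)


if __name__ == "__main__":
    z = [0.1] * 8                                   # 8-channel latent code
    layers = random_layers([3 + 3 + 8, 16, 3])      # hypothetical sizes
    wi = normalize([0.2, 0.1, 1.0])
    wo = normalize([-0.3, 0.4, 1.0])
    print(decode_brdf(z, [0.0, 0.0, 1.0, 1.0, 0.0, 0.0], wi, wo, layers))
```

Because the frame is a rotation, the decoder receives unit-length directions in a coordinate system that can tilt per texel, which is one plausible way to see why such a prior helps with normal-map-like mesoscale detail.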

Paper Structure (58 sections, 9 equations, 17 figures, 5 tables)

This paper contains 58 sections, 9 equations, 17 figures, 5 tables.

Figures (17)

  • Figure 1: We show rendered images of five reference materials created with a layering approach similar to Jakob2019 that we approximate with neural models for representing the BRDF and importance sampling. All objects are challenging for real-time renderers due to their complex reflection behavior and high resolution textures (see \ref{tab:material-stats}). The corresponding shading graphs are provided in the supplementary material.
  • Figure 2: First two columns: approximations of the multi-layer Teapot materials from \ref{fig:material-overview} using a simple analytical BRDF, parameterized by only 8 spatially-varying input channels: base color (3), specular roughness (1), specular normal map (2), specularity (1), and metalness (1). Third column: our neural BRDF parameterized by an 8-channel latent texture. FLIP visualizations emphasize the perceptual differences against the reference (last column, \ref{fig:material-overview}, \ref{tab:material-stats}). The parameters for the analytic BRDF are either numerically optimized or tuned manually. In both cases, we see a much larger approximation error because the analytic BRDF lacks the expressive power to capture the complexity of the reference, e.g. the view-dependent blue color of the ceramic glazing.
  • Figure 3: We use our neural BRDFs in a renderer as follows: for each ray that hits a surface with a neural BRDF, we perform standard $(u,v)$ and MIP level $l$ computation, and query the latent texture of the neural material. Then we input the latent code $\mathbf{z}(\mathbf{x})$ into one or two neural decoders, depending on the needs of the rendering algorithm. The BRDF decoder (top box) first extracts two shading frames from $\mathbf{z}(\mathbf{x})$, transforms directions ${\bm{\omega}_\mathrm{i}}$ and ${\bm{\omega}_\mathrm{o}}$ into each of them, and passes the transformed directions and $\mathbf{z}(\mathbf{x})$ to an MLP that predicts the BRDF value (and optionally the directional albedo). The importance sampler (bottom box) extracts parameters of an analytical, two-lobe distribution, which is then sampled for an outgoing direction ${\bm{\omega}_\mathrm{o}}$, and/or evaluated for PDF $p(\mathbf{x},{\bm{\omega}_\mathrm{i}},{\bm{\omega}_\mathrm{o}})$.
  • Figure 4: Highly detailed materials will alias significantly when rendered without supersampling (left columns, unfiltered). Supersampling averages high frequency glints and produces a filtered material, but at impractical sample cost for real-time (right columns, ground truth at 512 SPP). Our neural material can render filtered materials without aliasing at any distance, without supersampling (middle columns, ours).
  • Figure 5: We optimize our model by uniformly sampling the UV domain of the reference material. We start by fetching surface parameters (e.g., albedo), encoding them into a latent code using an MLP, and interpreting that code as a BRDF value using the decoder (path marked with 1). Once the encoder is sufficiently trained, we construct the latent texture (2) by processing all texels, and then drop the encoder. We continue "finetuning" the latent texture by sampling the UV space and MIP levels of the texture and optimizing the texels directly (3). We sample exponentially distributed filter footprints to optimize all levels of the latent texture, and train the decoder with prefiltered versions of the input material.
  • ...and 12 more figures
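
The MIP-level lookup from the Figure 3 caption and the footprint sampling from the Figure 5 caption can be sketched together in a few lines of Python. This is a rough illustration under stated assumptions, not the paper's code: "exponentially distributed" is read here as log-uniform (footprint = 2^U with U uniform over the levels), the latent texture is reduced to a single scalar channel, the pyramid is built with a 2x2 box filter, and the lookup uses nearest-texel fetches blended linearly between two levels. All function names are hypothetical.

```python
import math
import random


def build_pyramid(tex):
    """Box-filter a square power-of-two texture down to 1x1 (level 0 is finest)."""
    levels = [tex]
    while len(levels[-1]) > 1:
        src = levels[-1]
        n = len(src) // 2
        levels.append([[(src[2 * y][2 * x] + src[2 * y][2 * x + 1] +
                         src[2 * y + 1][2 * x] + src[2 * y + 1][2 * x + 1]) / 4.0
                        for x in range(n)] for y in range(n)])
    return levels


def sample_footprint(rng, max_level):
    """Assumed log-uniform footprint in texels, between 1 and 2**max_level."""
    return 2.0 ** rng.uniform(0.0, max_level)


def mip_level(footprint_texels, max_level):
    """Continuous level of detail: log2 of the footprint, clamped to the pyramid."""
    lod = math.log2(max(footprint_texels, 1.0))
    return min(lod, float(max_level))


def fetch(pyramid, u, v, lod):
    """Nearest-texel fetch at (u, v) in [0, 1], blended between adjacent levels."""
    lo = int(math.floor(lod))
    hi = min(lo + 1, len(pyramid) - 1)
    f = lod - lo

    def texel(level):
        n = len(pyramid[level])
        x = min(int(u * n), n - 1)
        y = min(int(v * n), n - 1)
        return pyramid[level][y][x]

    return (1.0 - f) * texel(lo) + f * texel(hi)


if __name__ == "__main__":
    tex = [[float(4 * y + x) for x in range(4)] for y in range(4)]
    pyramid = build_pyramid(tex)
    max_level = len(pyramid) - 1
    rng = random.Random(0)
    for _ in range(3):
        footprint = sample_footprint(rng, max_level)
        lod = mip_level(footprint, max_level)
        print(footprint, lod, fetch(pyramid, rng.random(), rng.random(), lod))
```

Sampling the footprint log-uniformly makes the resulting level of detail uniform over the pyramid, so under this assumption every level of the latent texture would receive a similar share of training samples.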