Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Liwen Wu; Sai Bi; Zexiang Xu; Fujun Luan; Kai Zhang; Iliyan Georgiev; Kalyan Sunkavalli; Ravi Ramamoorthi

TL;DR

This work tackles the challenge of high-fidelity view-dependent appearance for specular objects in NeRF-based rendering. It introduces Neural Directional Encoding (NDE), combining a cubemap-based far-field feature grid with cone-traced near-field spatial features to form a learnable, spatially varying directional encoding that reduces reliance on large MLPs. The approach achieves state-of-the-art or competitive results for view synthesis of glossy materials, supports real-time inference with compact decoders, and offers editability by separating near- and far-field reflections. This methodology enhances both the quality and practicality of neural rendering for scenes with complex interreflections and environment-based lighting, with potential applications in neural materials and radiance caching.

Abstract

Novel-view synthesis of specular objects like shiny metals or glossy paints remains a significant challenge. Not only the glossy appearance but also global illumination effects, including reflections of other objects in the environment, are critical components to faithfully reproduce a scene. In this paper, we present Neural Directional Encoding (NDE), a view-dependent appearance encoding of neural radiance fields (NeRF) for rendering specular objects. NDE transfers the concept of feature-grid-based spatial encoding to the angular domain, significantly improving the ability to model high-frequency angular signals. In contrast to previous methods that use encoding functions with only angular input, we additionally cone-trace spatial features to obtain a spatially varying directional encoding, which addresses the challenging interreflection effects. Extensive experiments on both synthetic and real datasets show that a NeRF model with NDE (1) outperforms the state of the art on view synthesis of specular objects, and (2) works with small networks to allow fast (real-time) inference. The project webpage and source code are available at: https://lwwu2.github.io/nde/
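
To make "feature-grid-based encoding in the angular domain" concrete, here is a minimal sketch of looking up a feature vector from a cubemap by direction. It is an illustration only, not the authors' implementation: the function name `cubemap_lookup`, the face ordering (+x, -x, +y, -y, +z, -z), the in-face axis convention, and the nearest-texel lookup are all assumptions made for this example. A trainable version would interpolate between texels and store the cubemap as optimizable parameters.

```python
import numpy as np

def cubemap_lookup(cubemap, d):
    """Nearest-texel feature lookup (hypothetical helper, not the paper's code).

    cubemap: array of shape (6, R, R, C), one R x R grid of C-dim features per face.
    d:       direction of shape (3,); it does not need to be normalized.
    """
    d = np.asarray(d, dtype=np.float64)
    d = d / np.linalg.norm(d)
    axis = int(np.argmax(np.abs(d)))               # dominant axis: 0=x, 1=y, 2=z
    face = 2 * axis + (0 if d[axis] > 0 else 1)    # assumed order: +x,-x,+y,-y,+z,-z
    # Project the two remaining components onto the face plane; both lie in [-1, 1].
    others = [i for i in range(3) if i != axis]
    u = d[others[0]] / abs(d[axis])
    v = d[others[1]] / abs(d[axis])
    R = cubemap.shape[1]
    # Map [-1, 1] to integer texel indices in [0, R - 1].
    col = min(int((u + 1.0) / 2.0 * R), R - 1)
    row = min(int((v + 1.0) / 2.0 * R), R - 1)
    return cubemap[face, row, col]

# Usage: a random 6 x 8 x 8 cubemap holding 4-dimensional features.
features = np.random.default_rng(0).normal(size=(6, 8, 8, 4))
encoding = cubemap_lookup(features, [0.2, -0.1, 0.9])   # falls on the +z face
```

The returned vector plays the role of the directional encoding that a small decoder MLP would consume in place of an analytical function of the direction.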

Paper Structure (34 sections, 17 equations, 15 figures, 12 tables)

Introduction
Related work
Feature-grid-based NeRF.
Rendering specular objects.
Preliminaries
Discussion on directional encoding.
Neural directional encoding
Far-field features
Near-field features
Optimization
Occupancy-grid sampling.
Regularization.
Implementation details.
Experiments
Background and capturer.
...and 19 more sections

Figures (15)

Figure 1: Ours vs. analytical encoding. Methods like Ref-NeRF (Verbin et al., 2022) use an analytical function to encode viewing directions in large MLPs, failing to model complex reflections (columns 1-2 of the insets). Instead, we encode view-dependent effects into feature grids with better interreflection parameterization, successfully reconstructing the details on the teapot and even multi-bounce reflections of the pink ball (3rd column of the insets) with little computational overhead (75 FPS on an NVIDIA 3090 GPU).
Figure 2: Pipeline of our neural directional encoding (NDE). We encode far-field reflections into a cubemap and near-field interreflections into a volume. Both representations store learnable feature vectors to encode direction and are mip-mapped to account for rough reflections. Given a reflected ray, the features are combined by tracing a cone of size proportional to the surface roughness to aggregate spatial features with cubemap features blended as the background. The result is fed into an MLP to output the specular color (the paper's color-MLP equation).
Figure 3: Our cubemap-based feature encoding requires only a small MLP (2 layers, 64 width) to model details in mirror reflections (3rd image) comparable with IDE (Verbin et al., 2022) (2nd image; 8 layers, 256 width MLP), which fails when the MLP is small (1st image).
Figure 4: Spatio-spatial encoding (middle) is equivalent to the common spatio-angular encoding (left) of mirror reflections, but it captures the variation of $\mathbf{x}'$ across different $\mathbf{x}$. The idea can be extended to model rough reflections by cone tracing mip-mapped spatial features covered by the reflection cone (right).
Figure 5: Our cone-traced near-field features successfully reconstruct the reflected spheres (2nd column) under novel views, which are overfitted by the angular-only encoding (1st column).
...and 10 more figures
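
The blending step described in the Figure 2 caption can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's code: the helper names `reflect` and `composite_features` are hypothetical, the near-field samples and their opacities are passed in directly rather than fetched from a mip-mapped volume, and the blend is ordinary front-to-back alpha compositing with the far-field cubemap feature weighted by the leftover transmittance. Selecting a mip level from the cone footprint is omitted.

```python
import numpy as np

def reflect(view_dir, normal):
    """Mirror the unit direction pointing toward the camera about the normal."""
    view_dir = np.asarray(view_dir, dtype=np.float64)
    n = np.asarray(normal, dtype=np.float64)
    n = n / np.linalg.norm(n)
    return 2.0 * np.dot(view_dir, n) * n - view_dir

def composite_features(near_feats, alphas, far_feat):
    """Blend near-field samples over a far-field background (assumed scheme).

    near_feats: (N, C) features sampled along the reflected ray, nearest first.
    alphas:     (N,) opacity of each sample in [0, 1].
    far_feat:   (C,) cubemap feature used as the background.
    """
    near_feats = np.asarray(near_feats, dtype=np.float64)
    alphas = np.asarray(alphas, dtype=np.float64)
    far_feat = np.asarray(far_feat, dtype=np.float64)
    # Transmittance reaching each sample: product of (1 - alpha) over earlier samples.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    near = (weights[:, None] * near_feats).sum(axis=0)
    # Whatever is not absorbed by near-field samples sees the far-field feature.
    return near + (1.0 - weights.sum()) * far_feat

# Usage: three near-field samples with 2-dim features over a far-field background.
omega_r = reflect([0.0, 0.6, 0.8], [0.0, 0.0, 1.0])
blended = composite_features([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                             [0.5, 0.5, 0.0], [2.0, 2.0])
```

With fully transparent samples the result reduces to the far-field feature, and an opaque first sample hides everything behind it, which is the separation of near- and far-field reflections that the TL;DR mentions for editing.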
