G-NeLF: Memory- and Data-Efficient Hybrid Neural Light Field for Novel View Synthesis

Lutao Jiang, Lin Wang

TL;DR

G-NeLF is a versatile grid-based NeLF approach that uses spatial-aware features to unleash the inference capability of the neural network and thereby overcome the difficulties of NeLF training.

Abstract

Following the burgeoning interest in implicit neural representation, Neural Light Field (NeLF) has been introduced to predict the color of a ray directly. Unlike Neural Radiance Field (NeRF), NeLF does not create a point-wise representation by predicting color and volume density for each point in space. However, current NeLF methods must first train a NeRF model and then synthesize over 10K views to train the NeLF for improved performance. In addition, the rendering quality of NeLF methods is lower than that of NeRF methods. In this paper, we propose G-NeLF, a versatile grid-based NeLF approach that uses spatial-aware features to unleash the inference capability of the neural network and thereby overcome the difficulties of NeLF training. Specifically, we employ a spatial-aware feature sequence derived from a meticulously crafted grid as the ray's representation. Drawing on our empirical studies of the adaptability of multi-resolution hash tables, we introduce a novel grid-based ray representation for NeLF that can represent the entire space with a very limited number of parameters. To better utilize the feature sequence, we design a lightweight ray color decoder that simulates the ray propagation process, enabling more efficient inference of the ray's color. G-NeLF can be trained without significant storage overhead and, with a model size of only 0.95 MB, surpasses the previous state-of-the-art NeLF. Moreover, compared with grid-based NeRF methods such as Instant-NGP, G-NeLF uses only one-tenth of the parameters while achieving higher performance. Our code will be released upon acceptance.

Paper Structure (11 sections, 3 equations, 8 figures, 12 tables, 1 algorithm)

Figures (8)

  • Figure 1: A comparison between R2L [wang2022r2l] and our G-NeLF. a) R2L needs a pre-trained NeRF model to synthesize 10,000 training images for training the NeLF model. b) Our G-NeLF needs only the original images, the same amount as in NeRF's dataset. c) Comparison of rendering quality and model size among ours, R2L, and the NeLF proposed by Attal et al. [attal2022learning]. The upper half shows the Realistic Synthetic 360° dataset. The lower half shows scenes containing relatively dense views in the Real Forward-Facing dataset.
  • Figure 2: An overview of our G-NeLF framework. For one ray emitted from a camera, we first sample a few points along it in order and use our designed hash multi-resolution triplane to obtain their features. These features then compose the ray's feature sequence, which represents the ray. Finally, we design a ray color decoder to transform the ray feature sequence into an RGB color.
  • Figure 3: Illustration of the difference in ray representation between R2L [wang2022r2l] and our G-NeLF. R2L directly concatenates frequency-encoded [mildenhall2020nerf] coordinates as the network input. We use a feature sequence to represent one ray.
  • Figure 4: Visual comparison of different numbers of feature masks. a) Instant-NGP. b) Instant-NGP with the top 6 resolutions masked. c) Instant-NGP with the top 10 resolutions masked. The results show that masking features at higher resolutions still preserves the overall shape information.
  • Figure 5: Visual comparison with R2L [wang2022r2l] on the Lego, Mic, Ficus, and Ship scenes in the Realistic Synthetic 360° dataset.
  • ...and 3 more figures
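
The pipeline in the Figure 2 caption (ordered point sampling, hash multi-resolution triplane lookup, feature sequence, ray color decoder) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class names, resolutions, table size, nearest-vertex lookup, summation over the three planes, and the small recurrent decoder with random weights are all assumptions chosen for illustration; the text above only states that the decoder is lightweight and simulates ray propagation.

```python
import math
import random

PRIME = 2654435761  # large prime commonly used in Instant-NGP-style spatial hashes


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sample_points(origin, direction, near, far, n):
    """Return n >= 2 evenly spaced points, ordered from near to far along the ray."""
    ts = [near + (far - near) * i / (n - 1) for i in range(n)]
    return [[o + t * d for o, d in zip(origin, direction)] for t in ts]


class HashTriplane:
    """Toy multi-resolution triplane backed by small hash tables (assumed layout)."""

    def __init__(self, resolutions=(16, 32, 64), table_size=1024, feat_dim=2, seed=0):
        rng = random.Random(seed)
        self.resolutions = resolutions
        self.table_size = table_size
        # tables[level][plane][slot] is one feature vector of length feat_dim
        self.tables = [[[[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
                         for _ in range(table_size)]
                        for _ in range(3)]
                       for _ in resolutions]

    def _lookup(self, level, plane, u, v):
        res = self.resolutions[level]
        # Nearest grid vertex for brevity; a real grid would interpolate bilinearly.
        iu = min(max(int(round(u * res)), 0), res)
        iv = min(max(int(round(v * res)), 0), res)
        slot = (iu ^ (iv * PRIME)) % self.table_size
        return self.tables[level][plane][slot]

    def features(self, p):
        """Map a point in [0, 1]^3 to its concatenated per-level features."""
        x, y, z = p
        out = []
        for level in range(len(self.resolutions)):
            planes = [self._lookup(level, 0, x, y),   # XY plane
                      self._lookup(level, 1, x, z),   # XZ plane
                      self._lookup(level, 2, y, z)]   # YZ plane
            out.extend(sum(vals) for vals in zip(*planes))  # assumed: sum over planes
        return out


class RecurrentDecoder:
    """Stand-in decoder: consumes the feature sequence near-to-far, outputs RGB."""

    def __init__(self, in_dim, hidden=8, seed=1):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.w_x, self.w_h, self.w_o = rand(hidden, in_dim), rand(hidden, hidden), rand(3, hidden)
        self.hidden = hidden

    def __call__(self, seq):
        h = [0.0] * self.hidden
        for f in seq:  # one state update per sample, in ray order
            h = [math.tanh(dot(wx, f) + dot(wh, h)) for wx, wh in zip(self.w_x, self.w_h)]
        return [1.0 / (1.0 + math.exp(-dot(wo, h))) for wo in self.w_o]  # sigmoid to (0, 1)


def render_ray(grid, decoder, origin, direction, near=0.1, far=0.9, n_samples=8):
    """Sample points, look up their features, and decode the sequence to one RGB value."""
    points = sample_points(origin, direction, near, far, n_samples)
    sequence = [grid.features(p) for p in points]
    return decoder(sequence)


grid = HashTriplane()
decoder = RecurrentDecoder(in_dim=len(grid.resolutions) * 2)
print(render_ray(grid, decoder, [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]))
```

In this sketch the ray is represented by a short ordered sequence of grid features rather than by concatenated encoded coordinates, which is the contrast drawn in the Figure 3 caption. The weights are random, so the printed color is meaningless until the tables and decoder are trained.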