
Factorized Multi-Resolution HashGrid for Efficient Neural Radiance Fields: Execution on Edge-Devices

Kim Jun-Seong, Mingyu Kim, GeonU Kim, Tae-Hyun Oh, Jin-Hwa Kim

Abstract

We introduce Fact-Hash, a novel parameter-encoding method for training neural radiance fields on-device. Neural Radiance Fields (NeRF) have proven pivotal in 3D representation, but their applications are limited by their heavy computational requirements. On-device training can open up broad application areas, offering advantages under communication limitations, for privacy concerns, and for fast adaptation to frequently changing scenes. However, limited resources (GPU memory, storage, and power) impede deployment. To address this, Fact-Hash merges tensor factorization and hash-encoding techniques. This integration offers two benefits: the use of rich high-resolution features and few-shot robustness. In Fact-Hash, we project 3D coordinates into multiple lower-dimensional forms (2D or 1D) before applying the hash function, and then aggregate the results into a single feature. Comparative evaluations against state-of-the-art methods demonstrate Fact-Hash's superior memory efficiency while preserving quality and rendering speed: Fact-Hash reduces memory usage by over one-third while maintaining PSNR relative to previous encoding methods. The on-device experiment shows that Fact-Hash outperforms alternative positional encoding methods in computational efficiency and energy consumption. These findings highlight Fact-Hash as a promising solution for improving feature-grid representations, addressing memory constraints, and improving quality in various applications. Project page: https://facthash.github.io/

Paper Structure

This paper contains 27 sections, 7 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Comparison of Instant-NGP [muller2022instant], TensoRF [chen2022tensorf], K-planes [fridovich2023k], and ours in terms of PSNR and inference time on the edge device. Training is conducted on a standard GPU machine, whereas inference is performed on the edge device, in line with standard edge-device utilization practices. The area of each circle represents the model size.
  • Figure 2: Conceptual illustration of parameter encoding: iNGP [muller2022instant], TensoRF [chen2022tensorf], K-planes [fridovich2023k], and the proposed method.
  • Figure 3: Schematic of the proposed method, Fact-Hash. For a given point $P$, we first project the point onto the tri-plane and obtain the surrounding multi-resolution 2D indices on each plane. For these indices, we look up the corresponding features in the assigned hash tables and perform bilinear interpolation to compose three multi-resolution feature vectors $f_{xy}$, $f_{yz}$, and $f_{zx}$. The interpolated feature vectors are combined via the Hadamard product and decoded into the density $\sigma$ and the color $c$ by shallow MLP decoders. We predict the color of each ray using the standard volumetric rendering formula with the densities and colors along the ray, and minimize the reconstruction loss together with a regularization term.
  • Figure 4: Qualitative results for the 8-view case on the NeRF synthetic dataset. Rendered images show the chair and hotdog scenes as reconstructed by iNGP, TensoRF, K-planes, and ours.
  • Figure 5: PSNR (a) and SSIM (b) as a function of the number of uniformly sampled input views. All metrics are averaged over 7 scenes of the NeRF synthetic dataset.
  • ...and 5 more figures
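
The encoding pipeline described in the Figure 3 caption (tri-plane projection, hashed multi-resolution lookup, bilinear interpolation, Hadamard product) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The hash function (an XOR of prime-multiplied indices, in the style of Instant-NGP), the table layout, the resolutions, the `[0, 1]` coordinate normalization, and all function names (`hash2d`, `plane_features`, `fact_hash_encode`) are assumptions made for illustration. The MLP decoders and volumetric rendering are omitted.

```python
import numpy as np

# Assumed hash primes (Instant-NGP style); the paper's exact hash may differ.
PRIMES = (1, 2654435761)

def hash2d(ij, table_size):
    """Hash integer 2D grid indices of shape (..., 2) into [0, table_size)."""
    ij = ij.astype(np.uint64)
    h = (ij[..., 0] * np.uint64(PRIMES[0])) ^ (ij[..., 1] * np.uint64(PRIMES[1]))
    return (h % np.uint64(table_size)).astype(np.int64)

def plane_features(uv, table, res):
    """Bilinearly interpolate hashed features on one plane at one resolution.

    uv: (N, 2) coordinates in [0, 1]; table: (T, F) hash table; res >= 2.
    Returns an (N, F) array.
    """
    x = uv * (res - 1)
    x0 = np.clip(np.floor(x).astype(np.int64), 0, res - 2)  # lower-left corner
    w = x - x0                                              # interpolation weights
    out = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            corner = x0 + np.array([di, dj])
            idx = hash2d(corner, table.shape[0])
            wi = w[:, 0] if di else 1.0 - w[:, 0]
            wj = w[:, 1] if dj else 1.0 - w[:, 1]
            out = out + (wi * wj)[:, None] * table[idx]
    return out

def fact_hash_encode(p, tables, resolutions):
    """Encode 3D points p of shape (N, 3) in [0, 1]^3.

    tables: one dict per resolution level, mapping 'xy', 'yz', 'zx'
    to a (T, F) hash table (hypothetical layout).
    Returns an (N, F * len(resolutions)) feature array.
    """
    planes = {"xy": [0, 1], "yz": [1, 2], "zx": [2, 0]}
    per_plane = {}
    for name, axes in planes.items():
        feats = [plane_features(p[:, axes], tables[level][name], res)
                 for level, res in enumerate(resolutions)]
        per_plane[name] = np.concatenate(feats, axis=-1)  # multi-resolution vector
    # Hadamard (element-wise) product of the three plane features.
    return per_plane["xy"] * per_plane["yz"] * per_plane["zx"]
```

In this sketch each plane stores 2D tables rather than one 3D table, which is where the memory saving of a factorized representation would come from. The returned feature would then be passed to the shallow decoders that predict $\sigma$ and $c$.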