MixRT: Mixed Neural Representations For Real-Time NeRF Rendering

Chaojian Li, Bichen Wu, Peter Vajda, Yingyan Celine Lin

TL;DR

MixRT is a mixed neural representation for real-time NeRF rendering that combines a low-quality mesh, a view-dependent displacement map, and a NeRF model compressed into a hash table. The design maps each stage to hardware that is common on edge devices and reachable through WebGL: the rasterizer finds the ray-mesh intersection, texture units fetch the spherical harmonics (SH) coefficients and scale used to calibrate that point, and SIMD units query the hash table and run a small MLP that converts the interpolated embeddings into a color. On a MacBook M1 Pro laptop, MixRT renders at over 30 FPS at 1280×720, improves PSNR by 0.2 dB on the indoor scenes of Unbounded-360, and needs less than 80% of the storage of state-of-the-art real-time methods. The results indicate that high geometric complexity is not strictly necessary for photorealistic rendering, which offers a practical route to NeRF applications on edge devices.

Abstract

Neural Radiance Field (NeRF) has emerged as a leading technique for novel view synthesis, owing to its impressive photorealistic reconstruction and rendering capability. Nevertheless, achieving real-time NeRF rendering in large-scale scenes has presented challenges, often leading to the adoption of either intricate baked mesh representations with a substantial number of triangles or resource-intensive ray marching in baked representations. We challenge these conventions, observing that high-quality geometry, represented by meshes with a large number of triangles, is not necessary for achieving photorealistic rendering quality. Consequently, we propose MixRT, a novel NeRF representation that includes a low-quality mesh, a view-dependent displacement map, and a compressed NeRF model. This design effectively harnesses the capabilities of existing graphics hardware, thus enabling real-time NeRF rendering on edge devices. Leveraging a highly optimized WebGL-based rendering framework, our proposed MixRT attains real-time rendering speeds on edge devices (over 30 FPS at a resolution of 1280 × 720 on a MacBook M1 Pro laptop), better rendering quality (0.2 dB higher PSNR in indoor scenes of the Unbounded-360 dataset), and a smaller storage size (less than 80% of that of state-of-the-art methods).

Paper Structure

This paper contains 24 sections, 2 equations, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: Our proposed MixRT can enable real-time rendering (> 30 FPS) at a resolution of 1280 × 720 on a MacBook M1 Pro laptop with better rendering quality and smaller storage size compared to SotA works on real-time NeRF rendering (Yariv et al., 2023; Reiser et al., 2023).
  • Figure 2: An overview of our proposed MixRT rendering pipeline. MixRT integrates three core components: a low-quality mesh, a view-dependent displacement map, and a NeRF model compressed into a hash table. This combination aims to maximize utilization of diverse hardware resources. To render an image pixel: (1) We use rasterizer hardware to perform mesh rasterization, determining the ray-mesh intersection point, $\mathbf{p}$. (2) Leveraging texture mapping units, we use texture coordinates to access maps containing the spherical harmonics (SH) coefficients and scale, and from them compute the calibrated point, $\mathbf{p}_{cali}$. (3) Lastly, $\mathbf{p}_{cali}$ is processed by SIMD units, which retrieve the embeddings of its eight closest vertices from the 3D grid stored as a hash table. A small MLP network then converts the interpolated embeddings into the final rendered color.
  • Figure 3: Visual comparison between our proposed MixRT and MeRF (Reiser et al., 2023), a real-time NeRF rendering work with SotA rendering quality vs. efficiency trade-offs. The rendered images are randomly selected from the test set.
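
Steps (2) and (3) of the Figure 2 pipeline can be illustrated with a small CPU sketch. This is not the authors' WebGL implementation, and several details are assumptions made only for illustration: the displacement is taken to move the hit point along the view direction by scale × SH(d), the SH expansion is truncated at degree 1 with one common sign convention, the grid has a single resolution level, the spatial hash uses the primes popularized by Instant-NGP, and the MLP has one hidden layer with arbitrary weights. All function names are hypothetical, and the ray-mesh intersection point p from step (1) is assumed to be given.

```python
import math
import random

# Real spherical-harmonics basis constants for degrees 0 and 1.
SH_C0 = 0.28209479177387814
SH_C1 = 0.4886025119029199

def eval_sh_deg1(coeffs, d):
    """Evaluate a 4-coefficient (degree-1) real SH expansion at unit direction d."""
    x, y, z = d
    return (SH_C0 * coeffs[0] - SH_C1 * y * coeffs[1]
            + SH_C1 * z * coeffs[2] - SH_C1 * x * coeffs[3])

def calibrate(p, d, sh_coeffs, scale):
    """Assumed form of step (2): shift p along d by scale * SH(d)."""
    offset = scale * eval_sh_deg1(sh_coeffs, d)
    return tuple(pi + offset * di for pi, di in zip(p, d))

def hash_index(ix, iy, iz, table_size):
    """Spatial hash of an integer grid vertex (XOR of coordinate-prime products)."""
    return (ix ^ (iy * 2654435761) ^ (iz * 805459861)) % table_size

def lookup_embedding(p, table, resolution):
    """Step (3a): trilinear interpolation over the 8 grid vertices around p."""
    g = [c * resolution for c in p]
    base = [math.floor(c) for c in g]
    frac = [c - b for c, b in zip(g, base)]
    out = [0.0] * len(table[0])
    for corner in range(8):
        offs = [(corner >> k) & 1 for k in range(3)]
        weight = 1.0
        for k in range(3):
            weight *= frac[k] if offs[k] else 1.0 - frac[k]
        idx = hash_index(base[0] + offs[0], base[1] + offs[1],
                         base[2] + offs[2], len(table))
        for j, value in enumerate(table[idx]):
            out[j] += weight * value
    return out

def tiny_mlp(features, w1, w2):
    """Step (3b): one ReLU hidden layer, then a sigmoid to get RGB in (0, 1)."""
    hidden = [max(0.0, sum(f * w for f, w in zip(features, row))) for row in w1]
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in w2]
    return [1.0 / (1.0 + math.exp(-v)) for v in logits]

def shade_pixel(p, d, sh_coeffs, scale, table, resolution, w1, w2):
    """Calibrate the hit point, fetch its embedding, and decode a color."""
    p_cali = calibrate(p, d, sh_coeffs, scale)
    return tiny_mlp(lookup_embedding(p_cali, table, resolution), w1, w2)

if __name__ == "__main__":
    rng = random.Random(0)
    table = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(1 << 10)]
    w1 = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
    w2 = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
    rgb = shade_pixel((0.4, 0.5, 0.6), (0.0, 0.0, 1.0),
                      [0.5, 0.0, 0.1, 0.0], 0.05, table, 16, w1, w2)
    print(rgb)
```

In the pipeline described in the caption, the per-pixel SH coefficients and scale would come from texture lookups at the rasterized texture coordinates, and the hash-table query and MLP would run in a fragment shader; here they are plain Python lists so that the data flow from p to p_cali to color is easy to follow.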