RT-NeRF: Real-Time On-Device Neural Radiance Fields Towards Immersive AR/VR Rendering
Chaojian Li, Sixu Li, Yang Zhao, Wenbo Zhu, Yingyan Celine Lin
TL;DR
RT-NeRF targets real-time on-device NeRF rendering for AR/VR. It identifies two core bottlenecks in existing efficient NeRF methods: uniform point sampling, and dense accesses to and computations on the embeddings. The solution is two-pronged. On the algorithm side, RT-NeRF directly computes geometry from non-zero occupancy-grid cubes and uses a coarse-grained view-dependent ordering to skip invisible points. On the hardware side, a dedicated accelerator uses a hybrid sparse encoding scheme and specialized decoding units to exploit sparsity. The approach raises throughput by up to 3,201× while preserving rendering quality, and achieves substantial energy efficiency improvements on both edge and cloud hardware. This work demonstrates a viable algorithm-hardware co-design path for real-time NeRF, enabling immersive on-device AR/VR experiences and setting a foundation for future sparsity-aware accelerators.
Abstract
Neural Radiance Field (NeRF) based rendering has attracted growing attention thanks to its state-of-the-art (SOTA) rendering quality and wide applications in Augmented and Virtual Reality (AR/VR). However, immersive interactions enabled by real-time (> 30 FPS) NeRF-based rendering are still limited by the low throughput achievable on AR/VR devices. To this end, we first profile SOTA efficient NeRF algorithms on commercial devices and identify two primary causes of this inefficiency: (1) uniform point sampling and (2) the dense accesses and computations of the required embeddings in NeRF. We then propose RT-NeRF, which to the best of our knowledge is the first algorithm-hardware co-design for accelerating NeRF. Specifically, on the algorithm level, RT-NeRF integrates an efficient rendering pipeline that largely alleviates the inefficiency of the commonly adopted uniform point sampling method by directly computing the geometry of pre-existing points. Additionally, RT-NeRF leverages a coarse-grained view-dependent computing ordering scheme to eliminate the unnecessary processing of invisible points. On the hardware level, our proposed RT-NeRF accelerator (1) adopts a hybrid encoding scheme that adaptively switches between a bitmap-based and a coordinate-based sparsity encoding format for NeRF's sparse embeddings, aiming to maximize storage savings and thus reduce the required DRAM accesses while supporting efficient NeRF decoding; and (2) integrates both a dual-purpose bi-direction adder & search tree and a high-density sparse search unit to coordinate the two aforementioned encoding formats. Extensive experiments on eight datasets consistently validate the effectiveness of RT-NeRF, which achieves a large throughput improvement (e.g., 9.7× to 3,201×) over SOTA efficient NeRF solutions while maintaining rendering quality.
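
The hybrid encoding idea in the abstract can be illustrated with a small sketch. It picks whichever of a bitmap-based or coordinate-based format needs fewer bits for a given sparse vector. The cost model, the 8-bit value width, and all function names here are assumptions made for illustration; the abstract does not specify the accelerator's actual selection rule or storage layout.

```python
import math

def bitmap_bits(n, nnz, value_bits):
    # Bitmap format: one presence bit per position plus the packed non-zero values.
    return n + nnz * value_bits

def coordinate_bits(n, nnz, value_bits):
    # Coordinate format: each non-zero stores its index alongside its value.
    index_bits = max(1, math.ceil(math.log2(n)))
    return nnz * (index_bits + value_bits)

def encode(values, value_bits=8):
    """Encode a sparse vector in the cheaper of the two formats (assumed cost model)."""
    n = len(values)
    nonzero = [(i, v) for i, v in enumerate(values) if v != 0]
    nnz = len(nonzero)
    packed = [v for _, v in nonzero]
    if coordinate_bits(n, nnz, value_bits) < bitmap_bits(n, nnz, value_bits):
        return {"format": "coordinate", "n": n,
                "indices": [i for i, _ in nonzero], "values": packed}
    mask = [1 if v != 0 else 0 for v in values]
    return {"format": "bitmap", "n": n, "mask": mask, "values": packed}

def decode(enc):
    """Rebuild the dense vector from either format."""
    out = [0] * enc["n"]
    if enc["format"] == "coordinate":
        for i, v in zip(enc["indices"], enc["values"]):
            out[i] = v
    else:
        packed = iter(enc["values"])
        for i, bit in enumerate(enc["mask"]):
            if bit:
                out[i] = next(packed)
    return out

# A very sparse vector favors coordinates; a denser one favors the bitmap.
very_sparse = [0] * 64
very_sparse[5] = 3
denser = [1, 0, 2, 3, 0, 4, 5, 6]
print(encode(very_sparse)["format"], encode(denser)["format"])
```

Under this cost model the crossover depends only on the vector length and the number of non-zeros, which is why a per-block choice between the two formats can save storage compared with committing to either one.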
