HERO: Hardware-Efficient RL-based Optimization Framework for NeRF Quantization

Yipu Zhang, Chaofang Ma, Jinming Ge, Lin Jiang, Jiang Xu, Wei Zhang

TL;DR

HERO addresses the challenge of deploying NeRF on hardware by jointly optimizing mixed-precision quantization across hash tables and MLP layers using reinforcement learning guided by a cycle-accurate NeuRex-style simulator. The framework employs a DDPG agent that ingests hardware latency as feedback to balance reconstruction quality (PSNR) and hardware cost, enabling automatic, hardware-aware quantization without expert intervention. Key contributions include a unified observation space for NeRF components, continuous-action bit-width control with precise quantization formulas, and a cycle-accurate simulator integrating memory and caching effects. Experimental results show that HERO delivers up to 1.33× lower latency, up to 1.33× higher cost efficiency, and smaller model sizes compared with state-of-the-art baselines, making NeRF more practical for resource-constrained deployments on specialized accelerators.
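
The summary above does not spell out the quantization formulas or the reward, so the following is only a minimal sketch of what continuous-action bit-width control could look like. It assumes a HAQ-style linear mapping from an action in [0, 1] to an integer bit-width, symmetric uniform quantization, and a toy reward that trades PSNR loss against a latency budget. The function names, the bit-width range (2 to 8), and the weight `lam` are all hypothetical and are not taken from the HERO code.

```python
def action_to_bits(action, b_min=2, b_max=8):
    """Map a continuous action in [0, 1] to an integer bit-width.

    Assumption: a linear, HAQ-style mapping; HERO's exact formula may differ.
    """
    a = min(max(float(action), 0.0), 1.0)          # clamp the action to [0, 1]
    bits = round(b_min - 0.5 + a * (b_max - b_min + 1))
    return min(max(bits, b_min), b_max)            # clamp to the valid range


def quantize_uniform(values, bits):
    """Symmetric uniform quantization followed by dequantization."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    q_max = 2 ** (bits - 1) - 1                    # largest signed integer level
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0:
        return [float(v) for v in values]          # nothing to scale
    scale = max_abs / q_max                        # step size between levels
    out = []
    for v in values:
        q = min(max(round(v / scale), -q_max), q_max)
        out.append(q * scale)
    return out


def reward(psnr_quant, psnr_full, latency, latency_budget, lam=1.0):
    """Hypothetical reward: PSNR change minus a penalty for exceeding the budget."""
    over_budget = max(0.0, latency / latency_budget - 1.0)
    return (psnr_quant - psnr_full) - lam * over_budget


if __name__ == "__main__":
    weights = [i / 50 - 1 for i in range(101)]     # toy values in [-1, 1]
    for action in (0.0, 0.5, 1.0):
        bits = action_to_bits(action)
        deq = quantize_uniform(weights, bits)
        err = max(abs(a - b) for a, b in zip(weights, deq))
        print(f"action={action:.1f} bits={bits} max_abs_error={err:.5f}")
```

In a setup like this, the agent would emit one action per hash-table level or MLP layer, and the latency term would come from the accelerator simulator rather than from a fixed number.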

Abstract

Neural Radiance Field (NeRF) has emerged as a promising 3D reconstruction method, delivering high-quality results for AR/VR applications. While quantization methods and hardware accelerators have been proposed to enhance NeRF's computational efficiency, existing approaches face crucial limitations. Current quantization methods operate without considering hardware architecture, resulting in sub-optimal solutions within the vast design space encompassing accuracy, latency, and model size. Additionally, existing NeRF accelerators heavily rely on human experts to explore this design space, making the optimization process time-consuming, inefficient, and unlikely to discover optimal solutions. To address these challenges, we introduce HERO, a reinforcement learning framework performing hardware-aware quantization for NeRF. Our framework integrates a NeRF accelerator simulator to generate real-time hardware feedback, enabling fully automated adaptation to hardware constraints. Experimental results demonstrate that HERO achieves 1.31-1.33$\times$ better latency, 1.29-1.33$\times$ improved cost efficiency, and a more compact model size compared to CAQ, a previous state-of-the-art NeRF quantization framework. These results validate our framework's capability to effectively navigate the complex design space between hardware and algorithm requirements, discovering superior quantization policies for NeRF implementation. Code is available at https://github.com/ypzhng/HERO.

Paper Structure

This paper contains 17 sections, 13 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Instant NGP [muller2022instant] process
  • Figure 2: NeuRex-style [lee2023neurex] accelerator architecture with bit-serial PEs, adopted to obtain hardware feedback in our work
  • Figure 3: Overview of the proposed HERO framework, based on a DDPG agent
  • Figure 4: Latency comparison: NGP-CAQ vs. HERO
  • Figure 5: Cost Efficiency: NGP-CAQ vs. HERO
  • ...and 1 more figure