SAC-NeRF: Adaptive Ray Sampling for Neural Radiance Fields via Soft Actor-Critic Reinforcement Learning

Chenyu Ge

Abstract

Neural Radiance Fields (NeRF) have achieved photorealistic novel view synthesis but suffer from computational inefficiency due to dense ray sampling during volume rendering. We propose SAC-NeRF, a reinforcement learning framework that learns adaptive sampling policies using Soft Actor-Critic (SAC). Our method formulates sampling as a Markov Decision Process where an RL agent learns to allocate samples based on scene characteristics. We introduce three technical components: (1) a Gaussian mixture distribution color model providing uncertainty estimates, (2) a multi-component reward function balancing quality, efficiency, and consistency, and (3) a two-stage training strategy addressing environment non-stationarity. Experiments on Synthetic-NeRF and LLFF datasets show that SAC-NeRF reduces sampling points by 35-48% while maintaining rendering quality within 0.3-0.8 dB PSNR of dense sampling baselines. While the learned policy is scene-specific and the RL framework adds complexity compared to simpler heuristics, our work demonstrates that data-driven sampling strategies can discover effective patterns that would be difficult to hand-design.
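
The abstract names a multi-component reward balancing quality, efficiency, and consistency but does not give its form. The following is a minimal sketch of how such a reward might be composed, not the paper's actual definition. The function names, the three terms, and the weights (w_quality, w_eff, w_cons) are all assumptions introduced here for illustration.

```python
import math


def psnr(mse: float, max_val: float = 1.0) -> float:
    """PSNR in dB for a mean squared error on images scaled to [0, max_val]."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def sampling_reward(mse_adaptive: float, mse_dense: float,
                    n_used: int, n_dense: int,
                    neighbor_diff: float,
                    w_quality: float = 1.0,
                    w_eff: float = 0.5,
                    w_cons: float = 0.1) -> float:
    """Hypothetical reward; the terms and weights are assumptions, not from the paper."""
    # Quality: PSNR gap to the dense baseline (negative when adaptive is worse).
    quality = psnr(mse_adaptive) - psnr(mse_dense)
    # Efficiency: fraction of the dense sample budget that was saved.
    efficiency = 1.0 - n_used / n_dense
    # Consistency: penalize disagreement with neighboring rays or views.
    consistency = -neighbor_diff
    return w_quality * quality + w_eff * efficiency + w_cons * consistency


# Same image error as the dense baseline, but 108 instead of 192 samples per ray.
r_few = sampling_reward(0.001, 0.001, 108, 192, 0.0)
r_many = sampling_reward(0.001, 0.001, 192, 192, 0.0)
```

Under these assumed weights, a policy that matches dense quality with fewer samples scores higher than one that spends the full budget, which is the trade-off the abstract describes.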

Paper Structure (22 sections, 16 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Qualitative comparison showing SAC-NeRF maintains visual quality while reducing samples. From left to right: Ground Truth, NeRF (192 samples), SAC-NeRF (108 samples). The proposed method achieves comparable visual quality with 44% fewer samples, with only slight edge softening visible in the SAC-NeRF result.
  • Figure 2: Learned sampling distributions showing adaptive concentration near scene geometry. Top: uniform sampling baseline. Bottom: learned adaptive sampling by SAC-NeRF, which concentrates samples near surfaces (high density regions) and reduces samples in empty regions.
  • Figure 3: Training curve showing stable convergence of the RL policy. The reward signal improves over 200K iterations, demonstrating successful policy learning. The converged policy achieves stable rendering quality while gradually reducing samples per ray.
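
To make the behavior described for Figure 2 concrete, the sketch below allocates samples along one ray in proportion to per-bin weights using inverse-transform sampling, the technique used in NeRF's hierarchical sampling. It is a stand-in, not the learned SAC policy. The ray bounds, the bin count, the surface depth of 4.0, and the Gaussian-shaped weights are assumptions chosen only for illustration. The budget of 108 samples matches the figure quoted for Figure 1.

```python
import bisect
import math
import random


def allocate_samples(bin_edges, weights, n_samples, rng):
    """Draw sorted sample depths with probability proportional to per-bin weights."""
    eps = 1e-5  # keeps every bin reachable and avoids division by zero
    w = [x + eps for x in weights]
    total = sum(w)
    cdf = [0.0]
    for x in w:
        cdf.append(cdf[-1] + x / total)
    samples = []
    for _ in range(n_samples):
        u = rng.random()
        # Find the bin whose CDF interval contains u.
        i = min(bisect.bisect_right(cdf, u) - 1, len(w) - 1)
        # Place the sample inside the bin by linear interpolation.
        frac = (u - cdf[i]) / (cdf[i + 1] - cdf[i])
        samples.append(bin_edges[i] + frac * (bin_edges[i + 1] - bin_edges[i]))
    return sorted(samples)


# Assumed toy ray: depths in [2, 6], one surface near depth 4.0.
near, far, n_bins = 2.0, 6.0, 64
edges = [near + (far - near) * k / n_bins for k in range(n_bins + 1)]
centers = [0.5 * (edges[k] + edges[k + 1]) for k in range(n_bins)]
weights = [math.exp(-0.5 * ((c - 4.0) / 0.1) ** 2) for c in centers]

samples = allocate_samples(edges, weights, 108, random.Random(0))
near_surface = sum(1 for t in samples if abs(t - 4.0) < 0.3) / len(samples)
```

With weights peaked at the surface, most of the 108 samples fall close to depth 4.0 and almost none in the empty stretches of the ray. Uniform sampling would instead spread them evenly over the whole interval.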