Neural NeRF Compression
Tuan Pham, Stephan Mandt
TL;DR
This work tackles the storage overhead of grid-based NeRF representations with an encoder-free, per-scene nonlinear transform coding approach that compresses the three TensoRF-VM feature planes using a lightweight decoder. An importance-weighted loss focuses reconstruction on visually significant regions, and a masked spike-and-slab entropy model sparsifies the latent codes. Across four datasets, the method achieves better rate-distortion performance than prior grid-based compression baselines while adding only minor rendering overhead. In practice, this yields more compact NeRF models for storage-constrained applications without sacrificing rendering quality.
Abstract
Neural Radiance Fields (NeRFs) have emerged as powerful tools for capturing detailed 3D scenes through continuous volumetric representations. Recent NeRFs use feature grids to improve rendering quality and speed; however, these representations introduce significant storage overhead. This paper presents a method for efficiently compressing a grid-based NeRF model. Our approach follows the nonlinear transform coding paradigm, applying neural compression to the model's feature grids. Because training data comprising many i.i.d. scenes is lacking, we design an encoder-free, end-to-end optimized approach for individual scenes, using lightweight decoders. To exploit the spatial inhomogeneity of the latent feature grids, we introduce an importance-weighted rate-distortion objective and a sparse entropy model with a masking mechanism. Our experiments show that the proposed method surpasses existing works in both compression efficacy and reconstruction quality for grid-based NeRFs.
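To make the importance-weighted rate-distortion objective and the masked entropy model more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a squared-error distortion normalized by the total weight, a binary keep/drop mask per latent, and a zero-mean discretized Gaussian as the slab. All function names and the parameters `keep_prob`, `sigma`, and `lam` are hypothetical choices made for illustration.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def slab_bits(z, sigma=2.0):
    """Bits to code integer z under a Gaussian discretized to unit-width bins."""
    p = gaussian_cdf(z + 0.5, sigma) - gaussian_cdf(z - 0.5, sigma)
    return -math.log2(max(p, 1e-12))  # clamp to avoid log(0) in the tails


def spike_and_slab_bits(latents, mask, keep_prob=0.1, sigma=2.0):
    """Estimated code length of quantized latents under a masked spike-and-slab prior.

    A masked-out entry (mask == 0) is treated as exactly zero and only pays
    for signalling the mask; a kept entry pays for the mask bit plus the slab.
    """
    bits = 0.0
    for z, m in zip(latents, mask):
        if m:
            bits += -math.log2(keep_prob) + slab_bits(z, sigma)
        else:
            bits += -math.log2(1.0 - keep_prob)
    return bits


def weighted_distortion(target, recon, weights):
    """Squared error where each element is scaled by its importance weight."""
    total_w = sum(weights)
    err = sum(w * (t - r) ** 2 for t, r, w in zip(target, recon, weights))
    return err / total_w


def rd_objective(target, recon, weights, latents, mask, lam=0.01):
    """Importance-weighted distortion plus lam times the estimated rate in bits."""
    rate = spike_and_slab_bits(latents, mask)
    return weighted_distortion(target, recon, weights) + lam * rate


# Toy example: the second feature has low importance, so its error barely counts.
target, recon = [1.0, 0.5], [0.9, 0.0]
weights = [1.0, 0.01]
latents, mask = [3, 0], [1, 0]
loss = rd_objective(target, recon, weights, latents, mask)
```

In this sketch, dropping a latent via the mask costs a fraction of a bit when `keep_prob` is small, which is what pushes the optimizer toward sparse latent grids. In an actual training loop the hard mask and the rounding would each need a differentiable relaxation, which is omitted here.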
