CodecNeRF: Toward Fast Encoding and Decoding, Compact, and High-quality Novel-view Synthesis
Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park
TL;DR
CodecNeRF addresses the challenge of making NeRF a ubiquitously transmittable 3D media format by coupling a forward-pass encoder–decoder with test-time finetuning. It introduces 3D feature construction from multi-view images, vector-quantized compression of those 3D features into multi-resolution triplanes, and a two-headed MLP renderer, augmented with parameter-efficient finetuning (PEFT) and entropy coding of the finetuning deltas. Empirical results on Objaverse, Google Scanned Objects, and DTU show more than 100x compression with faster encoding and comparable or better image quality than strong baselines. This work moves toward practical 3D content delivery over networks and opens avenues for further compression and 3D codec research.
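To make the vector-quantization step concrete, the following is a minimal NumPy sketch of generic nearest-neighbor codebook lookup. It is not the authors' implementation: the function name `vector_quantize`, the codebook size (256), the feature dimension (8), and the random inputs are all assumptions chosen for illustration, and the size estimate assumes the decoder already holds the codebook.

```python
import numpy as np

def vector_quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (squared L2).

    features: (N, D) array, codebook: (K, D) array.
    Returns (indices of shape (N,), quantized features of shape (N, D)).
    """
    # Pairwise squared distances between features and codebook entries: (N, K).
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d2.argmin(axis=1)
    return indices, codebook[indices]

# Hypothetical sizes: 1024 feature vectors of dimension 8, 256 codebook entries.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 8)).astype(np.float32)
features = rng.normal(size=(1024, 8)).astype(np.float32)

indices, quantized = vector_quantize(features, codebook)

# Storage comparison: raw float32 features versus one codebook index per vector.
raw_bits = features.size * 32
index_bits = indices.size * int(np.ceil(np.log2(len(codebook))))
ratio = raw_bits / index_bits
print(f"raw: {raw_bits} bits, indices: {index_bits} bits, ratio: {ratio:.0f}x")
```

Only the integer indices need to be transmitted in this toy setting, which is why quantization shrinks the payload; an entropy coder applied to the indices could reduce it further.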
Abstract
Neural Radiance Fields (NeRF) have achieved huge success in effectively capturing and representing 3D objects and scenes. However, to become as ubiquitous as everyday media formats such as images and videos, NeRF must fulfill three key objectives: (1) fast encoding and decoding, (2) compact model sizes, and (3) high-quality renderings. Despite recent advancements, a comprehensive algorithm that adequately addresses all three objectives has yet to be fully realized. In this work, we present CodecNeRF, a neural codec for NeRF representations, consisting of an encoder and decoder architecture that can generate a NeRF representation in a single forward pass. Furthermore, inspired by recent parameter-efficient finetuning approaches, we propose a finetuning method that efficiently adapts the generated NeRF representation to a new test instance, leading to high-quality image renderings and compact code sizes. CodecNeRF, a new encoding-decoding-finetuning pipeline for NeRF, achieves unprecedented compression of more than 100x and a remarkable reduction in encoding time while maintaining (or improving) image quality on widely used 3D object datasets.
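As a rough illustration of why parameter-efficient finetuning yields compact codes, the sketch below uses a low-rank additive update, one well-known PEFT technique. The abstract does not specify which PEFT method is used, so the low-rank form, the helper `lowrank_delta_params`, and the layer sizes are assumptions made only for this example.

```python
import numpy as np

def lowrank_delta_params(d_out, d_in, rank):
    """Number of values in a rank-`rank` update A @ B for a (d_out, d_in) weight."""
    return rank * (d_out + d_in)

# Hypothetical layer size and rank, chosen only for illustration.
d_out, d_in, rank = 256, 256, 4
rng = np.random.default_rng(0)

W = rng.normal(size=(d_out, d_in))   # stand-in for a weight produced by the forward pass
A = np.zeros((d_out, rank))          # zero init, so the update starts as a no-op
B = rng.normal(size=(rank, d_in))

# Only A and B would be trained and transmitted; W stays frozen.
W_adapted = W + A @ B

full_params = d_out * d_in
delta_params = lowrank_delta_params(d_out, d_in, rank)
print(f"full: {full_params}, low-rank delta: {delta_params}, "
      f"ratio: {full_params / delta_params:.0f}x")
```

Under these assumed sizes, sending the low-rank factors instead of a full finetuned weight matrix cuts the per-layer payload by the ratio printed above, before any entropy coding is applied.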
