Compressible-composable NeRF via Rank-residual Decomposition

Jiaxiang Tang, Xiaokang Chen, Jingbo Wang, Gang Zeng

TL;DR

This work addresses the manipulation and storage challenges of NeRF representations by proposing an explicit neural field built from tensor rank decomposition. It introduces rank-residual learning to preserve essential information in the leading ranks, and a rank-truncation mechanism that adjusts the level of detail without retraining. The model supports arbitrary composition by concatenating rank components across objects and applying per-object affine transforms, with no shared renderer or retraining required. Empirically, it achieves near-optimal compression and competitive rendering quality while enabling efficient editing and assembly of multi-object scenes. The approach is particularly relevant for editors and pipelines that need editable, storage-efficient NeRFs and multi-object composition.

Abstract

Neural Radiance Field (NeRF) has emerged as a compelling method to represent 3D objects and scenes for photo-realistic rendering. However, its implicit representation makes the models harder to manipulate than explicit representations such as meshes. Recent approaches to NeRF manipulation are usually restricted by a shared renderer network or suffer from large model size. To circumvent these hurdles, we present an explicit neural field representation that enables efficient and convenient manipulation of models. To achieve this goal, we learn a hybrid tensor rank decomposition of the scene without neural networks. Motivated by the low-rank approximation property of the SVD algorithm, we propose a rank-residual learning strategy to encourage the preservation of primary information in lower ranks. The model size can then be dynamically adjusted by rank truncation to control the levels of detail, achieving near-optimal compression without extra optimization. Furthermore, different models can be arbitrarily transformed and composed into one scene by concatenating along the rank dimension. The growth of storage cost can also be mitigated by compressing the unimportant objects in the composed scene. We demonstrate that our method achieves rendering quality comparable to state-of-the-art methods while additionally enabling compression and composition. Code will be made available at https://github.com/ashawkey/CCNeRF.
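
The two operations named in the abstract, rank truncation and concatenation along the rank dimension, can be illustrated on a toy 2D matrix. The following is a minimal NumPy sketch, not the authors' implementation: the 32x32 random "field", the helper `truncate`, and all variable names are assumptions made for illustration, and the per-object transforms mentioned in the abstract are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

def truncate(U, S, Vt, k):
    """Keep only the k leading rank-1 components of an SVD (hypothetical helper)."""
    return (U[:, :k] * S[:k]) @ Vt[:k]

# Toy 2D "field"; the paper factorizes a 3D scene, which this sketch does not model.
field = rng.standard_normal((32, 32))
U, S, Vt = np.linalg.svd(field, full_matrices=False)

# Low-rank approximation property of the SVD: the error never grows as ranks are added.
errors = [np.linalg.norm(field - truncate(U, S, Vt, k)) for k in (4, 8, 16, 32)]

# Composition: concatenating factors along the rank axis sums the two fields.
Ua, Va = rng.standard_normal((32, 3)), rng.standard_normal((32, 3))  # object A, rank 3
Ub, Vb = rng.standard_normal((32, 5)), rng.standard_normal((32, 5))  # object B, rank 5
U_cat = np.concatenate([Ua, Ub], axis=1)  # rank 3 + 5 = 8
V_cat = np.concatenate([Va, Vb], axis=1)
composed = U_cat @ V_cat.T  # equals Ua @ Va.T + Ub @ Vb.T
```

In this sketch, truncation works because the SVD orders its components by importance. A decomposition learned by gradient descent has no such ordering by default, which is the gap the rank-residual learning strategy is meant to close.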

Paper Structure (29 sections, 10 equations, 10 figures, 5 tables)

Figures (10)

  • Figure 1: Compressibility and Composability of our method. We present a tensor rank decomposition based neural field representation, which supports model compression through rank truncation, and arbitrary composition between different models through rank concatenation. Both operations require neither extra optimization nor any constraints in training (e.g., a shared renderer).
  • Figure 2: Model structure. Our model is composed of a matrix storing rank weights for different feature channels, and a set of decomposed rank components. Each rank component can be either vector- or matrix-based, and the ratio can be controlled to trade off between model size and performance. To query any 3D coordinate, we first project it to the decomposed vectors or matrices as denoted by the black lines, and then perform weighted interpolation. $||$ denotes concatenation along the rank dimension.
  • Figure 3: Compression at any rank. Combined with the empirical sort-and-truncate strategy, the proposed model achieves near-optimal compression at any rank. We use the HY-S model on the LEGO dataset as an example, and the dashed lines indicate where we apply rank-residual supervision.
  • Figure 4: Visualization of rank importance. Ranks are sorted column-wise by averaged rank importance. The rank importance is more concentrated in the proposed method (right) than in the baseline (left), which is crucial for truncation-based compression. We use the HY model on the LEGO dataset as an example, and the dashed lines indicate where we apply rank-residual supervision.
  • Figure 5: Compressing a scene composed of multiple objects. For a scene composed of many different objects, we can compress the less important objects to improve efficiency and reduce storage at a small cost in rendering quality.
  • ...and 5 more figures
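
The query procedure described in the Figure 2 caption (project the coordinate onto the decomposed components, interpolate, then combine with the rank weights) can be sketched for the vector-based case only. This is a hedged illustration, not the paper's code: the function names `interp1d` and `query`, the array layouts, the normalized [0, 1] coordinates, and the choice of linear interpolation are all assumptions. Matrix-based components and the sort-and-truncate step from Figure 3 are omitted.

```python
import numpy as np

def interp1d(vectors, t):
    """Linearly interpolate each rank's vector at coordinate t in [0, 1].
    vectors: (R, N) array, one length-N vector per rank. Returns shape (R,)."""
    pos = t * (vectors.shape[1] - 1)
    i0 = int(np.floor(pos))
    i1 = min(i0 + 1, vectors.shape[1] - 1)
    frac = pos - i0
    return (1.0 - frac) * vectors[:, i0] + frac * vectors[:, i1]

def query(vx, vy, vz, weights, point, rank=None):
    """Evaluate a vector-based (CP-style) field at a 3D point.
    weights: (C, R) matrix of per-channel rank weights (assumed layout).
    rank: if given, use only the first `rank` components (rank truncation)."""
    x, y, z = point
    comp = interp1d(vx, x) * interp1d(vy, y) * interp1d(vz, z)  # (R,)
    k = comp.shape[0] if rank is None else rank
    return weights[:, :k] @ comp[:k]  # (C,)

# Usage with random toy data: 8 ranks, 16 samples per axis, 4 feature channels.
rng = np.random.default_rng(0)
R, N, C = 8, 16, 4
vx, vy, vz = (rng.standard_normal((R, N)) for _ in range(3))
weights = rng.standard_normal((C, R))
full = query(vx, vy, vz, weights, (0.25, 0.5, 0.75))
coarse = query(vx, vy, vz, weights, (0.25, 0.5, 0.75), rank=4)
```

Truncation here is just slicing along the rank axis, so the coarse query reads half of the stored components and needs no re-optimization. This only preserves quality if the leading ranks carry most of the information.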