SCARF: Scalable Continual Learning Framework for Memory-efficient Multiple Neural Radiance Fields

Yuze Wang, Junyi Wang, Chen Wang, Wantong Duan, Yongtang Bao, Yue Qi

Abstract

This paper introduces a novel continual learning framework for synthesising novel views of multiple scenes. The framework learns multiple 3D scenes incrementally and updates the network parameters using only the training data of the upcoming new scene. We build on Neural Radiance Fields (NeRF), which uses a multi-layer perceptron to model the density and radiance field of a scene as an implicit function. While NeRF and its extensions have shown a powerful capability for rendering photo-realistic novel views of a single 3D scene, managing the growing number of 3D NeRF assets efficiently is a new scientific problem. Very few works focus on the efficient representation or continual learning of multiple scenes, which is crucial for practical applications of NeRF. To achieve these goals, our key idea is to represent multiple scenes as the linear combination of a cross-scene weight matrix and a set of scene-specific weight matrices generated by a global parameter generator. Furthermore, we propose an uncertain surface knowledge distillation strategy to transfer the radiance field knowledge of previous scenes to the new model. Representing multiple 3D scenes with such weight matrices significantly reduces memory requirements. At the same time, the uncertain surface distillation strategy largely overcomes the catastrophic forgetting problem and maintains the photo-realistic rendering quality of previous scenes. Experiments show that the proposed approach achieves state-of-the-art rendering quality for continual learning NeRF on the NeRF-Synthetic, LLFF, and TanksAndTemples datasets while keeping the storage cost extremely low.

Paper Structure (44 sections, 11 equations, 9 figures, 11 tables)

Figures (9)

  • Figure 1: Given a sequence of 3D scenes, the proposed SCARF factorizes the MLP into a set of scene-specific weight matrices and a cross-scene weight matrix. A global parameter generator produces the scene-specific weight matrices and learns features that generalize across scenes. When a new 3D scene arrives, the only additional parameters needed to introduce it into the network are random noise and a coefficient matrix.
  • Figure 2: Qualitative results of continual learning on the previous three scenes of the NeRF-Synthetic dataset.
  • Figure 3: Qualitative comparisons with traditional continual learning methods combined with NeRF. "EWC+NeRF" denotes continual learning of multiple NeRFs with Elastic Weight Consolidation, "PackNet+NeRF" with PackNet, "MEIL-NeRF*" with MEIL-NeRF, and "CL-NeRF*" with CL-NeRF.
  • Figure 4: Qualitative results of continual learning of five scenes on the TanksAndTemples dataset.
  • Figure 5: Qualitative results of continual learning of eight scenes on the LLFF dataset.
  • ...and 4 more figures
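
The weight factorization described in the abstract and in the Figure 1 caption can be illustrated with a small NumPy sketch. This is not the authors' implementation: the layer sizes, the linear form of the generator `G`, the use of a coefficient vector rather than a coefficient matrix, and all function names are assumptions made only to show how one layer's weights might be assembled from a shared cross-scene matrix plus generated scene-specific matrices, with only noise and coefficients stored per scene.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical, chosen only to keep the example small.
D_IN, D_OUT = 64, 64   # shape of one MLP layer
K = 4                  # number of generated scene-specific matrices
Z = 16                 # length of the per-scene noise vector

# Shared across scenes: the cross-scene matrix and a toy linear generator
# (an assumption; the paper's generator architecture is not given here).
W_cross = 0.01 * rng.standard_normal((D_OUT, D_IN))
G = 0.01 * rng.standard_normal((K * D_OUT * D_IN, Z))


def new_scene():
    """Create the per-scene state: a noise vector and mixing coefficients."""
    coeff = np.zeros(K + 1)
    coeff[0] = 1.0  # start from the cross-scene matrix alone
    return {"noise": rng.standard_normal(Z), "coeff": coeff}


def layer_weight(scene):
    """Linearly combine the cross-scene and generated scene-specific matrices."""
    specific = (G @ scene["noise"]).reshape(K, D_OUT, D_IN)
    c = scene["coeff"]
    # c[0] scales the shared matrix; c[1:] mixes the K generated matrices.
    return c[0] * W_cross + np.tensordot(c[1:], specific, axes=1)


def per_scene_param_count(scene):
    """Number of values stored per scene in this sketch."""
    return scene["noise"].size + scene["coeff"].size


scene = new_scene()
W = layer_weight(scene)
print(W.shape, per_scene_param_count(scene), D_OUT * D_IN)
```

In this toy setting each new scene stores 21 values (16 noise entries plus 5 coefficients), compared with 4096 for a full 64×64 layer. These figures come from the hypothetical sizes above and say nothing about the storage numbers reported in the paper.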