FrameNeRF: A Simple and Efficient Framework for Few-shot Novel View Synthesis

Yan Xing, Pan Wang, Ligang Liu, Daolun Li, Li Zhang

TL;DR

FrameNeRF addresses the challenge of few-shot novel view synthesis by combining a regularization-based data generator with a fast high-fidelity NeRF. The method uses a three-stage pipeline: generate pseudo-dense views from sparse inputs via a regularization model, train a fast NeRF on these views, and then fine-tune on the original sparse views to refine details. It achieves state-of-the-art results across Blender, LLFF, and DTU benchmarks, demonstrating strong rendering quality and robust multi-view consistency while maintaining fast training. The framework is modular and flexible, enabling the substitution of different regularization and fast-NeRF components to adapt to new data domains and performance targets.

Abstract

We present FrameNeRF, a novel framework that makes off-the-shelf fast high-fidelity NeRF models, which offer fast training and high rendering quality, usable for few-shot novel view synthesis. Such models typically train stably only on dense views, which makes them unsuitable for few-shot settings. To address this limitation, we use a regularization model as a data generator that produces dense views from sparse inputs, and these views are then used to train the fast high-fidelity model. Because the dense views are pseudo ground truth generated by the regularization model, the original sparse images are then used to fine-tune the fast high-fidelity model. This step helps the model learn realistic details and correct artifacts introduced in earlier stages. By leveraging an off-the-shelf regularization model and a fast high-fidelity model, our approach achieves state-of-the-art performance across various benchmark datasets.
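
The three-stage data flow described above can be illustrated with a deliberately tiny toy, which is not the authors' implementation. Everything in it is an assumption made for illustration: a "pose" is a single float, an "image" is a single float, the regularization model is replaced by linear interpolation, the fast high-fidelity model is replaced by a lookup table, and fine-tuning is replaced by overwriting pseudo values with real observations. All function names (`train_regularizer`, `train_fast`, `finetune`, `render_fast`, `frame_nerf_sketch`) are hypothetical. Only the order of the stages and what each stage consumes reflect the text.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pose = float    # toy stand-in for a camera pose (assumption)
Pixel = float   # toy stand-in for a rendered image (assumption)
View = Tuple[Pose, Pixel]


def train_regularizer(sparse: Sequence[View]) -> Callable[[Pose], Pixel]:
    """Toy 'regularization model': piecewise-linear interpolation.

    Assumes at least one view and distinct poses.
    """
    pts = sorted(sparse)

    def render(pose: Pose) -> Pixel:
        # Clamp outside the observed pose range.
        if pose <= pts[0][0]:
            return pts[0][1]
        if pose >= pts[-1][0]:
            return pts[-1][1]
        # Interpolate between the two neighbouring sparse views.
        for (p0, c0), (p1, c1) in zip(pts, pts[1:]):
            if p0 <= pose <= p1:
                t = (pose - p0) / (p1 - p0)
                return (1 - t) * c0 + t * c1
        raise AssertionError("unreachable for sorted, distinct poses")

    return render


def train_fast(views: Sequence[View]) -> Dict[Pose, Pixel]:
    """Toy 'fast high-fidelity model': a lookup table over training poses."""
    return dict(views)


def finetune(model: Dict[Pose, Pixel], sparse: Sequence[View]) -> Dict[Pose, Pixel]:
    """Toy fine-tuning: real observations replace pseudo values."""
    tuned = dict(model)
    tuned.update(sparse)
    return tuned


def render_fast(model: Dict[Pose, Pixel], pose: Pose) -> Pixel:
    """Render by returning the value stored at the nearest known pose."""
    nearest = min(model, key=lambda p: abs(p - pose))
    return model[nearest]


def frame_nerf_sketch(sparse: Sequence[View],
                      dense_poses: Sequence[Pose]) -> Dict[Pose, Pixel]:
    # Stage 1 (regularization): fit on sparse views, then generate
    # pseudo-dense views at the requested poses.
    reg_render = train_regularizer(sparse)
    pseudo_dense: List[View] = [(p, reg_render(p)) for p in dense_poses]
    # Stage 2 (intermediate training): train the fast model on pseudo views.
    fast_model = train_fast(pseudo_dense)
    # Stage 3 (fine-tuning): refine with the original real sparse views.
    return finetune(fast_model, sparse)


if __name__ == "__main__":
    sparse_views = [(0.0, 0.0), (1.0, 10.0)]
    poses = [i / 4 for i in range(5)]
    model = frame_nerf_sketch(sparse_views, poses)
    print(render_fast(model, 0.5))
```

In this toy, the fast model never sees the sparse views until the last stage, mirroring the ordering in the abstract: pseudo ground truth first, real images last so that they have the final say where the two disagree.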

Paper Structure (15 sections, 3 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Example novel view synthesis results from sparse views. Neither the regularization method nor the fast high-fidelity method alone performs well enough with sparse views. We therefore use the regularization model and the fast high-fidelity model together as components of our framework, which takes full advantage of their strengths and significantly improves rendering quality.
  • Figure 2: Overview of our method. We divide the training process into three stages. Regularization stage: the given sparse views are used as inputs to train the regularization model, and dense multi-view data are generated from the trained model. Intermediate training stage: the generated pseudo-dense views are used as input to train the fast high-fidelity model. Fine-tuning stage: the original real sparse input views are used to fine-tune the fast high-fidelity model.
  • Figure 3: Rendering results at different stages from different viewpoints.
  • Figure 4: Impact of the fine-tuning process. The top row shows the results with artifacts generated from the regularization model during the first stage. The bottom row depicts the results without artifacts from the fast high-fidelity model after fine-tuning.
  • Figure 5: Qualitative comparisons on the Blender dataset. FrameNeRF can learn more details and realistic colors.
  • ...and 3 more figures