
CombiNeRF: A Combination of Regularization Techniques for Few-Shot Neural Radiance Field View Synthesis

Matteo Bonotto, Luigi Sarrocco, Daniele Evangelista, Marco Imperoli, Alberto Pretto

TL;DR

CombiNeRF is a framework that synergistically combines several regularization techniques, some of them novel, to unify the benefits of each. It outperforms state-of-the-art methods in few-shot settings on several publicly available datasets.

Abstract

Neural Radiance Fields (NeRFs) have shown impressive results for novel view synthesis when a sufficiently large number of views is available. In few-shot settings, i.e., with a small set of input views, training can overfit those views, leading to artifacts and to geometric and chromatic inconsistencies in the resulting renderings. Regularization is a valid solution that helps NeRF generalization. However, each of the most recent NeRF regularization techniques aims to mitigate a specific rendering problem. Starting from this observation, in this paper we propose CombiNeRF, a framework that synergistically combines several regularization techniques, some of them novel, in order to unify the benefits of each. In particular, we regularize single-ray and neighboring-ray distributions, and we add a smoothness term to regularize near geometries. Beyond these geometric approaches, we propose to apply Lipschitz regularization to both the NeRF density and color networks and to use encoding masks to regularize the input features. We show that CombiNeRF outperforms state-of-the-art methods in few-shot settings on several publicly available datasets. We also present an ablation study on the LLFF and NeRF-Synthetic datasets that supports the choices made. We release the open-source implementation of our framework with this paper.
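
The abstract does not spell out how the Lipschitz regularization is implemented. As a rough illustration only, the sketch below shows one common formulation: each layer's weight matrix is rescaled so that its infinity norm (largest absolute row sum) stays under a learnable per-layer bound, and the product of those bounds is penalized in the loss. The function names (`softplus`, `lipschitz_normalize`, `lipschitz_bound`) and the pure-Python list representation are assumptions made for this example, and the details may differ from the CombiNeRF implementation.

```python
import math


def softplus(x):
    """Smooth positive reparameterization, keeps the per-layer bound > 0."""
    return math.log1p(math.exp(x))


def lipschitz_normalize(weight, c):
    """Rescale rows of `weight` so each absolute row sum is <= softplus(c).

    weight: list of rows (one row per output unit), hypothetical layout
    c:      raw learnable scalar for this layer (assumed parameterization)
    """
    bound = softplus(c)
    normalized = []
    for row in weight:
        abs_sum = sum(abs(w) for w in row)
        # Rows already under the bound are left untouched (scale = 1).
        scale = min(1.0, bound / abs_sum) if abs_sum > 0 else 1.0
        normalized.append([w * scale for w in row])
    return normalized


def lipschitz_bound(cs):
    """Product of per-layer bounds.

    With 1-Lipschitz activations this upper-bounds the Lipschitz constant
    of the whole network, so it can be added to the loss as a penalty.
    """
    product = 1.0
    for c in cs:
        product *= softplus(c)
    return product
```

In a training loop, a term such as `lambda_lip * lipschitz_bound(cs)` would be added to the total loss, where `lambda_lip` is a hypothetical weighting hyperparameter.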

Paper Structure (23 sections, 24 equations, 12 figures, 11 tables)


Figures (12)

  • Figure 1: CombiNeRF achieves better rendering and reconstruction quality in few-shot settings than Vanilla NeRF (Instant-NGP, torch-NGP).
  • Figure 2: Overview of the CombiNeRF framework. We sample 3D points over a batch of rays passing through the scene. Position and view direction are respectively encoded through Multi-resolution Hash Encoding and Spherical Harmonics and fed to the Lipschitz network (LipMLP) after being masked. Networks' outputs are used by volumetric rendering for estimating the expected color $C$ and depth $d$ of each ray, while different loss terms are computed to regularize the training process. CombiNeRF combines $i)$ all these regularization losses, $ii)$ the Lipschitz network instead of the original MLP, $iii)$ the Encoding Mask approach used for masking the networks' input.
  • Figure 3: Comparison of CombiNeRF against RegNeRF, FreeNeRF and Vanilla NeRF on the Fern, Horns and Trex scenes in the 3-view setting.
  • Figure 4: In-depth comparison of CombiNeRF against FreeNeRF on some LLFF scenes with 3/6/9 input views.
  • Figure 5: Comparison of CombiNeRF against the Vanilla NeRF method in Drums, Ship, Materials, Ficus and Hotdog scenarios.
  • ...and 7 more figures
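
The Figure 2 caption mentions that volumetric rendering estimates the expected color $C$ and depth $d$ of each ray. A minimal sketch of that step, using the standard NeRF quadrature (per-sample opacity multiplied by the accumulated transmittance), is given below. The function name `render_ray`, the bin-edge parameterization, and the use of bin midpoints for depth are illustrative assumptions, not details taken from the CombiNeRF code.

```python
import math


def render_ray(sigmas, colors, edges):
    """Composite samples along a single ray.

    sigmas: N densities, one per bin along the ray
    colors: N (r, g, b) tuples, one per bin
    edges:  N + 1 increasing distances delimiting the bins (assumed layout)
    Returns (expected color, expected depth, per-sample weights).
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    color = [0.0, 0.0, 0.0]
    depth = 0.0
    weights = []
    for i, (sigma, rgb) in enumerate(zip(sigmas, colors)):
        delta = edges[i + 1] - edges[i]
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this bin
        weight = transmittance * alpha
        t_mid = 0.5 * (edges[i] + edges[i + 1])
        for k in range(3):
            color[k] += weight * rgb[k]
        depth += weight * t_mid
        weights.append(weight)
        transmittance *= 1.0 - alpha
    return color, depth, weights
```

For example, a ray that crosses an empty bin and then hits an opaque green surface in the bin between distances 1 and 2 puts all of its weight on the second sample, so the rendered color is green and the expected depth is the midpoint of that bin. The per-sample weights are also the ray distribution that regularization losses on rays typically operate on.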