CMC: Few-shot Novel View Synthesis via Cross-view Multiplane Consistency

Hanxin Zhu; Tianyu He; Zhibo Chen

TL;DR

This work tackles NeRF-based few-shot novel view synthesis, where having only a few input views causes overfitting and poor depth estimation. It introduces Cross-view Multiplane Consistency (CMC), which builds a Multiplane Image (MPI) for each input view and enforces depth-aware consistency by sharing sampling points across views, supplemented by a reconstruction loss on seen views and appearance and depth consistency losses on unseen views. The approach, which combines per-view MPIs, weighted rendering, and cross-view losses, achieves state-of-the-art results on the LLFF and Shiny datasets without relying on complex priors or additional supervision, improving both visual quality and geometric continuity. By enabling robust cross-view geometry learning in sparse-view regimes, CMC offers practical gains for real-world view synthesis in VR/AR and related applications.

Abstract

Neural Radiance Field (NeRF) has shown impressive results in novel view synthesis, particularly in Virtual Reality (VR) and Augmented Reality (AR), thanks to its ability to represent scenes continuously. However, when only a few input views are available, NeRF tends to overfit the given views, so that the estimated depths of pixels share almost the same value. Unlike previous methods that regularize by introducing complex priors or additional supervision, we propose a simple yet effective method that explicitly builds depth-aware consistency across input views to tackle this challenge. Our key insight is that by forcing the same spatial points to be sampled repeatedly in different input views, we are able to strengthen the interactions between views and therefore alleviate the overfitting problem. To achieve this, we build the neural networks on layered representations (i.e., multiplane images), so that the sampling points can be resampled on multiple discrete planes. Furthermore, to regularize the unseen target views, we constrain the rendered colors and depths from different input views to be the same. Although simple, extensive experiments demonstrate that our proposed method achieves better synthesis quality than state-of-the-art methods.
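The rendering and consistency ideas in the abstract can be illustrated with a minimal NumPy sketch. It assumes the standard front-to-back "over" compositing commonly used for multiplane images and a plain mean-squared-error penalty between renders; the paper's exact formulation may differ. The names `composite_mpi` and `consistency_loss` are hypothetical, and the warping of each per-view MPI into the target view is omitted (the inputs are assumed to be already aligned to the same target camera).

```python
import numpy as np


def composite_mpi(rgb, alpha, depths):
    """Render color and expected depth from one MPI (hypothetical helper).

    rgb:    (D, H, W, 3) per-plane colors
    alpha:  (D, H, W)    per-plane opacities in [0, 1]
    depths: (D,)         plane depths, ordered near to far
    """
    # Transmittance of plane i: fraction of light not blocked by nearer planes.
    trans = np.cumprod(1.0 - alpha, axis=0)
    trans = np.concatenate([np.ones_like(alpha[:1]), trans[:-1]], axis=0)
    weights = alpha * trans                                  # (D, H, W)
    color = (weights[..., None] * rgb).sum(axis=0)           # (H, W, 3)
    depth = (weights * depths[:, None, None]).sum(axis=0)    # (H, W)
    return color, depth


def consistency_loss(render_a, render_b):
    """Mean squared difference between two renders of the same target view."""
    return float(np.mean((render_a - render_b) ** 2))


# Two per-view MPIs (random stand-ins), assumed already warped to one unseen view.
rng = np.random.default_rng(0)
D, H, W = 8, 4, 4
depths = np.linspace(1.0, 10.0, D)
mpi_a = (rng.random((D, H, W, 3)), rng.random((D, H, W)))
mpi_b = (rng.random((D, H, W, 3)), rng.random((D, H, W)))

color_a, depth_a = composite_mpi(*mpi_a, depths)
color_b, depth_b = composite_mpi(*mpi_b, depths)

# Appearance and depth consistency terms for the unseen view (unweighted here).
loss_ac = consistency_loss(color_a, color_b)
loss_dc = consistency_loss(depth_a, depth_b)
```

In this sketch both MPIs share the same set of plane depths, which is what lets the same spatial points be sampled from different input views; in training, the two consistency terms would be added to the reconstruction loss on the input views.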

Paper Structure (33 sections, 24 equations, 4 figures, 3 tables)

Introduction
Related Work
Novel View Synthesis
Few-shot NeRF
Multiplane Images
Preliminaries
Neural Radiance Field
Multiplane Images
Cross-view Multiplane Consistency
Motivation.
Method Overview.
Multiplane Representation for Input Views
Cross-view Consistency on Multiplanes
Reconstruction Loss for Input Views.
Appearance and Depth Consistency Loss for Unseen Views.
...and 18 more sections

Figures (4)

Figure 1: Given a few input views (e.g., 3 input views), (a) NeRF tends to overfit to input views and results in a dramatic performance drop, where the estimated depths of pixels share almost the same value. (b) Our key insight is to ensure the same spatial points can be sampled repeatedly in different input views. (c) Our proposed method can achieve smooth depth estimation by introducing cross-view multiplane consistency, resulting in better synthesis quality.
Figure 2: Qualitative comparisons on the Shiny dataset, where our proposed method can achieve better novel view synthesis and accurate geometry estimation (i.e., the depth map).
Figure 3: Qualitative comparisons on the LLFF dataset. Our proposed method can avoid the overfitting problem, where better novel view synthesis and more continuous depth estimation can be achieved.
Figure 4: Qualitative comparisons of different choices of loss functions. (1) Single MPI with $\mathcal{L}_{\text{MSE}}$. (2) Per-view MPI with $\mathcal{L}_{\text{MSE}}$. (3) Per-view MPI with $\mathcal{L}_{\text{MSE}}+\mathcal{L}_{\text{dc}}^{\text{I}}$. (4) Per-view MPI with $\mathcal{L}_{\text{MSE}}+\mathcal{L}_{\text{dc}}^{\text{I}}+\mathcal{L}_{\text{ac}}$. (5) Per-view MPI with $\mathcal{L}_{\text{MSE}}+\mathcal{L}_{\text{dc}}^{\text{I}}+\mathcal{L}_{\text{ac}}+\mathcal{L}_{\text{dc}}$.
