OracleGS: Grounding Generative Priors for Sparse-View Gaussian Splatting

Atakan Topaloglu, Kunyi Li, Michael Niemeyer, Nassir Navab, A. Murat Tekalp, Federico Tombari

TL;DR

Sparse-view novel view synthesis is ill-posed because of severe geometric ambiguity, and existing methods trade regressive fidelity against generative completeness. OracleGS pairs a 3D-aware diffusion model, which proposes complete views, with an MVS-based 3D oracle whose attention maps provide per-pixel uncertainty; that uncertainty in turn weights the 3D Gaussian Splatting optimization. The approach introduces an uncertainty-weighted loss on synthetic views and a progressive augmentation curriculum, and reports state-of-the-art results on Mip-NeRF 360 and NeRF Synthetic, especially in extremely sparse settings. The framework grounds powerful generative priors in multi-view geometric evidence, reducing hallucinations while preserving plausible completions, with practical implications for 3D content creation from few input views.

Abstract

Sparse-view novel view synthesis is fundamentally ill-posed due to severe geometric ambiguity. Current methods are caught in a trade-off: regressive models are geometrically faithful but incomplete, whereas generative models can complete scenes but often introduce structural inconsistencies. We propose OracleGS, a novel framework that reconciles generative completeness with regressive fidelity for sparse-view Gaussian Splatting. Instead of using generative models to patch incomplete reconstructions, our "propose-and-validate" framework first leverages a pre-trained 3D-aware diffusion model to synthesize novel views that together propose a complete scene. We then repurpose a multi-view stereo (MVS) model as a 3D-aware oracle to validate the generated views, using its attention maps to distinguish regions that are well supported by multi-view evidence from regions of high uncertainty caused by occlusion, lack of texture, or direct inconsistency. This uncertainty signal directly guides the optimization of a 3D Gaussian Splatting model via an uncertainty-weighted loss. Our approach conditions the powerful generative prior on multi-view geometric evidence, filtering hallucinatory artifacts while preserving plausible completions in under-constrained regions, and it outperforms state-of-the-art methods on datasets including Mip-NeRF 360 and NeRF Synthetic.
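
The uncertainty-weighted loss described above can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the function names, the min-max normalization of attention scores, the convention that high attention means low uncertainty, and the choice of a (1 - uncertainty)-weighted L1 term. Images are flattened to plain lists of floats so the example runs without dependencies.

```python
def attention_to_uncertainty(attention):
    """Map raw attention scores to per-pixel uncertainty in [0, 1].

    Assumption: strong multi-view attention means the pixel is well
    supported, so uncertainty = 1 - (min-max normalized attention).
    """
    if not attention:
        raise ValueError("attention map must not be empty")
    lo, hi = min(attention), max(attention)
    span = hi - lo
    if span == 0:
        # A constant map carries no ranking information; treating every
        # pixel as fully trusted is an arbitrary choice for this sketch.
        return [0.0 for _ in attention]
    return [1.0 - (a - lo) / span for a in attention]


def uncertainty_weighted_l1(rendered, synthetic, uncertainty):
    """Mean L1 error where each pixel is weighted by (1 - uncertainty).

    Hypothetical form of the loss: pixels the oracle trusts contribute
    fully, while pixels with uncertainty 1 are ignored.
    """
    if not (len(rendered) == len(synthetic) == len(uncertainty)):
        raise ValueError("all inputs must have the same number of pixels")
    if not rendered:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for r, s, u in zip(rendered, synthetic, uncertainty):
        total += (1.0 - u) * abs(r - s)
    return total / len(rendered)


# Two pixels with equal error; only the trusted one contributes.
loss = uncertainty_weighted_l1([0.0, 1.0], [1.0, 0.0], [0.0, 1.0])
```

Under these assumptions, a hallucinated region that the oracle marks as highly uncertain adds little to the gradient, while a well-supported completion is fitted almost as strictly as a real view.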

Paper Structure

This paper contains 23 sections, 7 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: OracleGS Reconciles Generative Completeness with Regressive Fidelity. First, a generative model proposes a potentially flawed synthetic view (middle left). Our MVS-based oracle then grounds this proposal by computing a 3D uncertainty map (middle), which identifies various sources of generative error, including faulty textures on the mat, inconsistent structures on the Lego, and under-observed backgrounds, as regions of high uncertainty (highlighted in blue). Using this signal to guide a confidence-weighted optimization, OracleGS filters these artifacts, producing novel views (middle right) with superior fidelity compared to both the synthetic input and the prior state-of-the-art DropGaussian [park2025dropgaussian] (left).
  • Figure 2: Overview. Given sparse input views with poses, we first estimate an initial point cloud and depth maps. A 3D-aware generative model then proposes novel synthetic views, while the attention maps of the 3D-aware oracle are used as a proxy for 3D uncertainty. Finally, we train the 3DGS model using a standard loss on the GT views and our novel uncertainty-guided loss on the synthetic views. We employ a progressive augmentation strategy over the course of the optimization to control the ratio of GT and synthetic images at each iteration, which helps to stabilize training and guide scene structure.
  • Figure 3: Visual comparison with state-of-the-art methods on the Mip-NeRF 360 [barron2022mipnerf360] dataset. Our method, OracleGS, demonstrates superior handling of common failure modes. Top row (Bonsai, 12 views): OracleGS accurately reconstructs the challenging carpet texture and background regions while competing methods produce noticeable artifacts. Middle row (Room, 12 views): Our method avoids the distortions present in other reconstructions. Bottom row (Bicycle, 24 views): Our approach strikes a balance between detail and smoothness, preventing the noisy overfitting seen in CoR-GS [zhang2024cor] and the oversmoothing that erases fine details in DropGaussian [park2025dropgaussian].
  • Figure 4: Visual comparison with state-of-the-art methods on the NeRF Synthetic [mildenhall2020nerf] dataset. Our method consistently produces higher-fidelity reconstructions across diverse and challenging scenes compared to prior work. Top row (Hotdog): OracleGS eliminates the "floater" artifacts present in competing methods, preserving details on the condiments and plate. Middle row (Ficus): Our method reconstructs the intricate vase structures where competing methods fail. Bottom row (Drums): Our method reconstructs the thin structures of the drum kit's stands and cymbals, which are fragmented or missing in other reconstructions.
  • Figure 5: Our 3D-aware Oracle quantifies diverse sources of 3D uncertainty. We visualize uncertainty maps (middle row) extracted on synthetic images from the generative model using global-attention layers of the repurposed MVS model [wang2025vggt], normalized from one to zero, where low uncertainty is shown in yellow and high uncertainty is shown in purple. Each column demonstrates the oracle's ability to identify a specific failure mode in the synthetic proposals by comparing against the ground truth.
  • ...and 1 more figure
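
The progressive augmentation strategy mentioned in the Figure 2 caption controls the ratio of GT and synthetic images at each iteration. The captions do not specify the schedule, so the following is only a sketch under stated assumptions: a linear ramp that starts with GT views only and ends at a fixed synthetic fraction. The function names, the ramp direction, and the 0.5 end value are all hypothetical.

```python
import random


def synthetic_ratio(step, total_steps, start=0.0, end=0.5):
    """Probability of sampling a synthetic view at a given iteration.

    Assumption: a linear ramp from `start` to `end`; the actual schedule
    used by the authors is not stated in this summary.
    """
    if total_steps <= 1:
        return end
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + t * (end - start)


def pick_view(step, total_steps, gt_views, synthetic_views, rng):
    """Choose one training view for this iteration.

    Synthetic views would be trained with the uncertainty-guided loss,
    GT views with the standard loss.
    """
    p = synthetic_ratio(step, total_steps)
    pool = synthetic_views if rng.random() < p else gt_views
    return rng.choice(pool)


rng = random.Random(0)
first_view = pick_view(0, 100, ["gt_0", "gt_1"], ["syn_0"], rng)
```

Starting from GT views only lets the real observations fix the coarse scene structure before the generated views are mixed in, which is one plausible reading of how such a curriculum "helps to stabilize training".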