Active3D: Active High-Fidelity 3D Reconstruction via Hierarchical Uncertainty Quantification

Yan Li, Yingzhao Li, Gim Hee Lee

TL;DR

Active3D addresses high-fidelity active 3D reconstruction by unifying implicit neural fields with explicit Gaussian primitives into a hybrid scene state. It builds a hierarchical uncertainty map that fuses global structural priors, local surface confidence, and temporal consistency to drive an EHIG-based next-best-view planner and a risk-aware trajectory. A keyframe strategy and a viewpoint-space sliding window enable scalable, uncertainty-aware refinement that preserves global–local consistency. Experimental results on Replica and MP3D demonstrate state-of-the-art accuracy, completeness, and rendering quality, highlighting robustness to occlusion and complex geometries in real-world robotic perception tasks.

Abstract

In this paper, we present an active exploration framework for high-fidelity 3D reconstruction that incrementally builds a multi-level uncertainty space and selects next-best-views through an uncertainty-driven motion planner. We introduce a hybrid implicit-explicit representation that fuses neural fields with Gaussian primitives to jointly capture global structural priors and locally observed details. Based on this hybrid state, we derive a hierarchical uncertainty volume that quantifies both implicit global structure quality and explicit local surface confidence. To focus optimization on the most informative regions, we propose an uncertainty-driven keyframe selection strategy that anchors high-entropy viewpoints as sparse attention nodes, coupled with a viewpoint-space sliding window for uncertainty-aware local refinement. The planning module formulates next-best-view selection as an Expected Hybrid Information Gain problem and incorporates a risk-sensitive path planner to ensure efficient and safe exploration. Extensive experiments on challenging benchmarks demonstrate that our approach consistently achieves state-of-the-art accuracy, completeness, and rendering quality, highlighting its effectiveness for real-world active reconstruction and robotic perception tasks.
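As a rough illustration of the planning idea described above, the sketch below scores candidate viewpoints by the fused uncertainty they would observe, minus a travel penalty. It is not the paper's Expected Hybrid Information Gain formulation, which is not given in this summary. The convex blend in `fuse_uncertainty`, the boolean visibility mask, and the names `w_global`, `travel_cost`, and `risk_weight` are all assumptions made for the example.

```python
import numpy as np


def fuse_uncertainty(u_global, u_local, w_global=0.5):
    """Blend two per-voxel uncertainty maps with a convex weight.

    Assumption: a simple weighted average stands in for the paper's
    hierarchical fusion, whose exact form is not stated here.
    """
    u_global = np.asarray(u_global, dtype=float)
    u_local = np.asarray(u_local, dtype=float)
    return w_global * u_global + (1.0 - w_global) * u_local


def select_next_best_view(u_fused, visibility, travel_cost, risk_weight=0.1):
    """Pick the candidate view with the highest gain-minus-cost score.

    u_fused:     (num_voxels,) fused uncertainty per voxel
    visibility:  (num_views, num_voxels) boolean mask, True if the voxel
                 is visible from that candidate view (hypothetical input)
    travel_cost: (num_views,) cost of reaching each candidate view
    """
    visibility = np.asarray(visibility, dtype=float)
    travel_cost = np.asarray(travel_cost, dtype=float)
    # Gain proxy: total uncertainty a view would observe.
    gain = visibility @ np.asarray(u_fused, dtype=float)
    # Penalize views that are expensive or risky to reach.
    scores = gain - risk_weight * travel_cost
    return int(np.argmax(scores)), scores


# Toy scene: four voxels, three candidate views.
u_fused = fuse_uncertainty([0.9, 0.1, 0.5, 0.2], [0.7, 0.3, 0.5, 0.0])
visibility = [
    [True, True, False, False],
    [False, False, True, True],
    [True, False, True, False],
]
best, scores = select_next_best_view(u_fused, visibility, [1.0, 0.5, 5.0])
print(best, scores)
```

In this toy scene the third view observes the most uncertainty, but its high travel cost lets the first view win. That trade-off between information and cost is the kind of behavior a risk-sensitive planner is meant to produce.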

Paper Structure

This paper contains 40 sections, 25 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Performance on the Replica dataset. Left: Comparison of rendering quality (PSNR) versus reconstruction completeness (C.R.) across state-of-the-art methods. Right: Qualitative outputs of our method including reconstructed mesh, Gaussian map, and estimated depth.
  • Figure 2: Our method processes the RGB-D stream through dual explicit and implicit reconstruction branches. The explicit branch projects data into a 3D Gaussian model, while the implicit branch employs an encoder-decoder architecture to regress RGB and SDF values. The discrepancy between the rendered RGB-D and the ground-truth RGB-D is then computed. A separate MLP predicts global uncertainty, while temporal variations on the SDF surface are characterized to derive uncertainty for the hybrid explicit-implicit representation. This representation then drives NBV selection and path planning. Finally, keyframes are selected within a sliding window for joint optimization of the explicit and implicit maps.
  • Figure 3: Qualitative comparison of 3D reconstruction results on representative MP3D sequences. Additional results and detailed comparisons for all Replica and MP3D sequences are provided in the supplementary material.
  • Figure 4: Visualization of uncertainties and their spatial relationship to the real scene. Our proposed hybrid strategy not only endows the agent with global optimization capabilities, but also enables it to perceive intricate structures and textures while handling occlusions.
  • Figure 5: Novel view synthesis results on the MP3D dataset. The tested viewpoints were not present in any training trajectories of the evaluated methods. PSNR values are indicated in the top-left corner. Challenging regions are highlighted with red boxes.
  • ...and 7 more figures