Active Implicit Object Reconstruction using Uncertainty-guided Next-Best-View Optimization

Dongyu Yan, Jianheng Liu, Fengyu Quan, Haoyao Chen, Mengmeng Fu

TL;DR

This work tackles active object reconstruction under limited views by unifying an implicit occupancy field with next-best-view (NBV) planning. It computes view uncertainty directly from the occupancy field and optimizes the next view on a continuous pose manifold by gradient descent, eliminating the need for a fixed set of candidate views. A top-N uncertainty criterion balances global coverage and local detail, while joint optimization of the object model and sensor poses enables fully autonomous, real-time operation. Demonstrations in simulation and on real-world robotic platforms show superior reconstruction accuracy and efficiency, and the method is released as open source for broader adoption.

Abstract

Actively planning sensor views during object reconstruction is crucial for autonomous mobile robots. An effective method should strike a balance between accuracy and efficiency. In this paper, we propose a seamless integration of the emerging implicit representation with the active reconstruction task. We build an implicit occupancy field as our geometry proxy. During training, the prior object bounding box is used as auxiliary information to generate clean and detailed reconstructions. To evaluate view uncertainty, we employ a sampling-based approach that directly extracts entropy from the reconstructed occupancy probability field as our measure of view information gain. This eliminates the need for additional uncertainty maps or learning. Unlike previous methods that compare view uncertainty within a finite set of candidates, we aim to find the next-best-view (NBV) on a continuous manifold. Leveraging the differentiability of the implicit representation, the NBV can be optimized directly by maximizing the view uncertainty using gradient descent. This significantly enhances the method's adaptability to different scenarios. Simulation and real-world experiments demonstrate that our approach effectively improves the reconstruction accuracy and the efficiency of view planning in active reconstruction tasks. The proposed system will be open-sourced at https://github.com/HITSZ-NRSL/ActiveImplicitRecon.git.
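
The two ideas in the abstract, entropy read directly from an occupancy probability field and a view optimized on a continuous manifold, can be illustrated with a small toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the field is an analytic 2D disc (`occupancy_prob`) rather than a trained network, the view is a single azimuth angle `theta` rather than a full sensor pose, and the gradient is estimated by central finite differences instead of backpropagation through the implicit model. All function names and constants are hypothetical.

```python
import numpy as np

R_CAM = 3.0  # hypothetical camera distance from the object centre


def occupancy_prob(points):
    """Toy 2D occupancy field (assumption): a unit disc whose boundary is
    sharp (well observed) near azimuth 0 and blurry (uncertain) near pi."""
    r = np.linalg.norm(points, axis=-1)
    phi = np.arctan2(points[..., 1], points[..., 0])
    sharpness = 11.0 + 9.0 * np.cos(phi)  # 20 on the sharp side, 2 on the blurry side
    return 1.0 / (1.0 + np.exp(-sharpness * (1.0 - r)))


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli occupancy probability."""
    p = np.clip(p, 1e-9, 1.0 - 1e-9)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def view_uncertainty(theta, n_rays=9, n_samples=32):
    """Mean entropy of points sampled along a fan of rays cast from a
    camera at azimuth theta that looks at the origin."""
    cam = R_CAM * np.array([np.cos(theta), np.sin(theta)])
    ray_angles = theta + np.pi + np.linspace(-0.2, 0.2, n_rays)
    dirs = np.stack([np.cos(ray_angles), np.sin(ray_angles)], axis=-1)  # (n_rays, 2)
    depths = np.linspace(1.5, 2.5, n_samples)                           # (n_samples,)
    points = cam + depths[None, :, None] * dirs[:, None, :]             # (n_rays, n_samples, 2)
    return float(binary_entropy(occupancy_prob(points)).mean())


def optimize_view(theta0, steps=200, lr=0.5, eps=1e-4):
    """Gradient ascent on view uncertainty over the continuous angle theta.
    The gradient is a central finite difference, standing in for autodiff."""
    theta = float(theta0)
    for _ in range(steps):
        grad = (view_uncertainty(theta + eps) - view_uncertainty(theta - eps)) / (2.0 * eps)
        theta += lr * grad  # ascend: move toward higher uncertainty
    return theta


theta_start = 1.0
theta_best = optimize_view(theta_start)
print(f"start: theta={theta_start:.2f}, uncertainty={view_uncertainty(theta_start):.3f}")
print(f"final: theta={theta_best:.2f}, uncertainty={view_uncertainty(theta_best):.3f}")
```

In this toy setting the optimized angle drifts toward the blurry side of the disc, where the occupancy probabilities are closest to 0.5 and the entropy is therefore highest. No candidate set is enumerated at any point; the view is updated only through the gradient of the uncertainty with respect to the view parameter.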

Paper Structure (23 sections, 14 equations, 11 figures, 2 tables)


Figures (11)

  • Figure 1: The architecture of our method. We construct our implicit occupancy field from volume-rendered color and depth supervision and additional free-ray supervision. The object bounding box prior is used to assist sampling and to bound the model range. We evaluate view uncertainty directly from the occupancy probability using a sampling-based approach. The NBV pose is iteratively optimized by maximizing the information gain through backpropagation and is used to guide the robot's movement for the next reconstruction cycle.
  • Figure 2: Illustration of ray types defined in our reconstruction method. The sampled rays are classified into invalid rays, valid rays and free rays.
  • Figure 3: Illustration of the unfair evaluation in the Dragon scene. The bottom images show the uncertainty color maps of different views after the $4^{\text{th}}$ reconstruction round. Although the front view (the right image) has already been well observed, its uncertainty sum is still significantly higher than that of the side view (the left image), which is not well reconstructed.
  • Figure 4: Reconstructed models and surface coverage curves of the Stanford Bunny, Dragon, and Armadillo. We compare our method against the Occlusion-Aware and Average-Entropy uncertainty policies used in [isler2016information], and against Random View, on reconstruction quality and surface coverage. The mesh extracted from our implicit surface has finer geometry and a smoother surface than voxel representations. The surface coverage, which reflects reconstruction efficiency, also converges faster.
  • Figure 5: Qualitative comparison results for Lego, Car, and House. It is challenging to reconstruct complex, realistic objects using only ten views. Our method can detect uncertain parts of the object and assign views for optimal observation. It removes artifacts generated in fixed-view settings and produces better results than the candidate-view method. The implicit reconstruction pipeline also builds finer geometry than traditional TSDF fusion methods.
  • ...and 6 more figures
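
The bias described in the Figure 3 caption, where a well-observed view still scores a higher uncertainty sum, and the top-N criterion mentioned in the TL;DR can be contrasted with a small sketch. This is one plausible reading of a top-N criterion (the mean of the N largest per-ray uncertainties), not necessarily the paper's exact definition. The function names and the per-ray entropy values are invented for illustration and are not results from the paper.

```python
import numpy as np


def sum_uncertainty(ray_entropy):
    """Plain sum over all rays: grows with the number of rays that hit the object."""
    return float(np.sum(ray_entropy))


def top_n_uncertainty(ray_entropy, n):
    """Mean of the n most uncertain rays (assumed reading of a top-N criterion)."""
    values = np.asarray(ray_entropy, dtype=float).ravel()
    n = max(1, min(n, values.size))
    # np.partition places the n largest values in the last n slots (unordered).
    top = np.partition(values, values.size - n)[-n:]
    return float(top.mean())


# Hypothetical per-ray entropies:
# a front view with many rays on the object, all already well observed ...
front = np.full(1000, 0.05)
# ... and a side view with fewer rays, some of which are still very uncertain.
side = np.concatenate([np.full(80, 0.01), np.full(20, 0.4)])

print("sum   :", sum_uncertainty(front), sum_uncertainty(side))
print("top-10:", top_n_uncertainty(front, 10), top_n_uncertainty(side, 10))
```

With these made-up values the sum ranks the front view higher simply because it covers more rays, while the top-10 mean ranks the side view higher because it contains the most uncertain rays.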