An Efficient Projection-Based Next-best-view Planning Framework for Reconstruction of Unknown Objects

Zhizhou Jia; Shaohui Zhang; Qun Hao

TL;DR

This work tackles the computational burden of next-best-view (NBV) planning for complete 3D object reconstruction by introducing a projection-based NBV framework. It represents unknown object structure as ellipsoids fitted to voxel clusters via Gaussian Mixture Models and Minimum Volume Enclosing Ellipsoids, and evaluates candidate viewpoints with a projection-based quality function that aggregates weighted ellipsoid projections. A global partitioning strategy keeps the planner from backtracking and from relying on purely greedy view selection, enabling more robust coverage. The approach yields up to about 10× efficiency gains in simulation with comparable coverage and is shown to be feasible in real-world experiments with a robotic arm and 3D camera. The contributions advance practical, scalable NBV planning for industrial robotics and quality inspection tasks.
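To make the ellipsoid-fitting step concrete, the following is a minimal sketch of computing a Minimum Volume Enclosing Ellipsoid for one cluster of voxel centers. It uses Khachiyan's iterative algorithm, which is one common way to compute an MVEE; the summary above does not say which solver the authors use, so this choice is an assumption. The function name `mvee`, its parameters, and the idea of passing in an already clustered point set (for example, the output of a GMM step) are hypothetical and for illustration only.

```python
import numpy as np


def mvee(points, tol=1e-4, max_iter=1000):
    """Approximate minimum volume enclosing ellipsoid (Khachiyan's algorithm).

    Returns (A, c) so that the ellipsoid is {x : (x - c)^T A (x - c) <= 1}.
    `points` is an (n, d) array, e.g. the voxel centers of one cluster.
    Illustrative sketch only; not the authors' implementation.
    """
    P = np.asarray(points, dtype=float)
    n, d = P.shape
    # Lift the points to homogeneous coordinates: Q has shape (d + 1, n).
    Q = np.vstack([P.T, np.ones(n)])
    # Start with uniform weights over the points.
    u = np.full(n, 1.0 / n)

    for _ in range(max_iter):
        X = (Q * u) @ Q.T                              # weighted scatter matrix
        # M[j] = q_j^T X^{-1} q_j for every lifted point q_j.
        M = np.einsum("ij,ij->j", Q, np.linalg.solve(X, Q))
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step                               # shift weight to the worst point
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break

    c = P.T @ u                                        # ellipsoid center
    cov = (P.T * u) @ P - np.outer(c, c)
    A = np.linalg.inv(cov) / d                         # ellipsoid shape matrix
    return A, c
```

Calling `mvee(cluster_points)` on each cluster yields one `(A, c)` pair per ellipsoid. The result is an approximation controlled by `tol`, so points on the boundary may fall marginally outside the returned ellipsoid.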

Abstract

Efficiently and completely capturing the three-dimensional data of an object is a fundamental problem in industrial and robotic applications. The task of next-best-view (NBV) planning is to infer the pose of the next viewpoint from the current data and thereby gradually achieve a complete three-dimensional reconstruction. Many existing algorithms, however, suffer from a large computational burden because they rely on ray-casting. To address this, this paper proposes a projection-based NBV planning framework. It can select the next best view very quickly while still ensuring that the object is scanned completely. Specifically, the framework fits different types of voxel clusters to ellipsoids based on the voxel structure. Then, the next best view is selected from the candidate views using a projection-based viewpoint quality evaluation function in conjunction with a global partitioning strategy. This process replaces ray-casting in voxel structures, significantly improving computational efficiency. Comparative experiments with other algorithms in a simulation environment show that the proposed framework can achieve a 10-fold efficiency improvement while capturing roughly the same coverage. Real-world experimental results also confirm the efficiency and feasibility of the framework.
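The idea of scoring a view by projecting ellipsoids instead of casting rays can be sketched as follows. This is a simplified illustration, not the paper's evaluation function: it assumes an orthographic projection along the viewing direction, ignores occlusion between ellipsoids, and takes the per-ellipsoid weights as given. The names `view_basis`, `projected_area`, and `view_score` are hypothetical. It relies on a standard fact: for an ellipsoid {x : (x - c)^T A (x - c) <= 1}, the orthographic shadow on a plane with orthonormal basis B (2×3) is an ellipse with area π·sqrt(det(B A⁻¹ Bᵀ)).

```python
import numpy as np


def view_basis(view_dir):
    """Return a 2x3 orthonormal basis of the plane perpendicular to view_dir."""
    d = np.asarray(view_dir, dtype=float)
    d = d / np.linalg.norm(d)
    # Pick a helper axis that is not (nearly) parallel to the viewing direction.
    helper = np.array([1.0, 0.0, 0.0]) if abs(d[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(d, helper)
    u = u / np.linalg.norm(u)
    v = np.cross(d, u)
    return np.vstack([u, v])


def projected_area(A, view_dir):
    """Area of the orthographic shadow of {x : (x-c)^T A (x-c) <= 1}."""
    B = view_basis(view_dir)
    S = B @ np.linalg.inv(A) @ B.T        # 2x2 shape matrix of the shadow ellipse
    return float(np.pi * np.sqrt(np.linalg.det(S)))


def view_score(shape_matrices, weights, view_dir):
    """Weighted sum of projected areas (assumed form; occlusion is ignored)."""
    return sum(w * projected_area(A, view_dir)
               for A, w in zip(shape_matrices, weights))
```

A planner built on this sketch would evaluate `view_score` for every candidate viewing direction and keep the highest-scoring one. Each evaluation costs a few small matrix operations per ellipsoid, which is where the saving over per-voxel ray-casting comes from.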

Paper Structure (16 sections, 11 equations, 12 figures, 2 tables, 1 algorithm)

Introduction
Related Works
Methodology
Framework Overview
Proposal of Candidate Viewpoints
Voxel Structure construction
Ellipsoid Representation
Projection-based Viewpoint Quality Evaluation Function
Ellipsoid observation weight calculation
Ellipsoid weighted projection calculation
Viewpoint quality evaluation function
Global partitioning strategy
Experiments
Simulation experiments
Real-world experiments
...and 1 more sections

Figures (12)

Figure 1: Overview of the NBV planning experiment platform. An object to be measured is placed on a turntable, and a 3D camera is mounted at the end of the robotic arm for data acquisition. Our NBV planning framework accomplishes a complete reconstruction of the object by controlling the robotic arm and the turntable. https://drive.google.com/file/d/1yQo_oHEjIK4X6LQc-lpWSFYunienocEl/view?usp=drive_link
Figure 2: Overview of the NBV planning framework proposed in this paper. The orange arrows describe the running steps of the NBV iteration process.
Figure 3: The components of the radius of the candidate viewpoint sampling region and the results of candidate viewpoint sampling within this region.
Figure 4: A 2D grid illustrating the classification rules for voxels in Octomap. (a) Classification results for a single input frame. (b) Classification results for multiple input frames. The green box is the bounding box of the object.
Figure 5: Results of different GMM clustering numbers. $T_o$ represents the number of clusters of Occupied voxels, and $T_f$ represents the number of clusters of Frontier voxels.
...and 7 more figures
