DM-OSVP++: One-Shot View Planning Using 3D Diffusion Models for Active RGB-Based Object Reconstruction
Sicong Pan, Liren Jin, Xuying Huang, Cyrill Stachniss, Marija Popović, Maren Bennewitz
TL;DR
DM-OSVP++ addresses RGB-based active object reconstruction with one-shot view planning guided by priors from a 3D diffusion model. It uses EscherNet to generate a proxy mesh from a few reference views, then casts view planning as a customized set-covering optimization that accounts for geometric and textural complexity through PFHRGB-based entropy and multi-view constraints. The selected views, drawn from an object-centric view space, are connected by a globally shortest path, which keeps data collection efficient. The approach is compatible with multiple RGB-based reconstruction backends (e.g., Instant-NGP, NeuS2, 2DGS). Real-world experiments with a UR5 robot show dynamic view-space adaptation and robust reconstruction under practical constraints, indicating that the method applies to diverse objects and environments. Overall, DM-OSVP++ achieves a favorable trade-off between viewpoint efficiency and reconstruction quality by combining diffusion priors with object-specific planning.
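To make the set-covering idea with multi-view constraints concrete, here is a minimal sketch in Python. It is not the authors' formulation: it uses a simple greedy heuristic in place of their customized optimization, and the names `greedy_multi_cover`, `visibility`, and `demand` are hypothetical. The sketch assumes each candidate view sees a known set of surface points on the proxy mesh, and that each point must be observed a required number of times, with higher requirements standing in for geometrically or texturally complex regions.

```python
from typing import Dict, List, Set


def greedy_multi_cover(
    visibility: Dict[int, Set[int]],
    demand: Dict[int, int],
) -> List[int]:
    """Greedily pick views until every point p is seen demand[p] times.

    visibility: hypothetical map from candidate view id to the point ids it sees.
    demand: hypothetical map from point id to its required number of observations.
    Stops early if no unused view can reduce the remaining demand.
    """
    remaining = dict(demand)   # observations still owed per point
    unused = set(visibility)   # each view can be selected at most once
    chosen: List[int] = []

    def gain(view: int) -> int:
        # Number of points this view sees that still need observations.
        return sum(1 for p in visibility[view] if remaining.get(p, 0) > 0)

    while unused and any(r > 0 for r in remaining.values()):
        # Sort for deterministic tie-breaking (lowest view id wins).
        best = max(sorted(unused), key=gain)
        if gain(best) == 0:
            break  # leftover demand cannot be met by the remaining views
        chosen.append(best)
        unused.remove(best)
        for p in visibility[best]:
            if remaining.get(p, 0) > 0:
                remaining[p] -= 1
    return chosen


# Toy example: point 2 is "complex" and must be seen from two views.
visibility = {0: {0, 1, 2}, 1: {2, 3}, 2: {0, 3}, 3: {1, 2, 3}}
demand = {0: 1, 1: 1, 2: 2, 3: 1}
selected_views = greedy_multi_cover(visibility, demand)
```

A greedy pass like this gives no optimality guarantee on the number of views; an exact solver over the same visibility and demand data would be needed for a minimum-size solution. The selected views would then still need to be ordered into a short path, which this sketch does not attempt.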
Abstract
Active object reconstruction is crucial for many robotic applications. A key aspect in these scenarios is generating object-specific view configurations to obtain informative measurements for reconstruction. One-shot view planning enables efficient data collection by predicting all views at once, eliminating the need for time-consuming online replanning. Our primary insight is to leverage the generative power of 3D diffusion models as valuable prior information. By conditioning on initial multi-view images, we exploit the priors from the 3D diffusion model to generate an approximate object model, serving as the foundation for our view planning. Our novel approach integrates the geometric and textural distributions of the object model into the view planning process, generating views that focus on the complex parts of the object to be reconstructed. We validate the proposed active object reconstruction system through both simulation and real-world experiments, demonstrating the effectiveness of using 3D diffusion priors for one-shot view planning.
