ScanDP: Generalizable 3D Scanning with Diffusion Policy

Itsuki Hirako, Ryo Hakoda, Yubin Liu, Matthew Hwang, Yoshihiro Sato, Takeshi Oishi

TL;DR

A data-efficient 3D scanning framework that uses Diffusion Policy to imitate human-like scanning strategies and adopts Occupancy Grid Mapping instead of direct point cloud processing, improving noise resilience and the handling of diverse object geometries.

Abstract

Learning-based 3D scanning plays a crucial role in enabling efficient and accurate scanning of target objects. However, recent reinforcement learning-based methods often require large-scale training data and still struggle to generalize to unseen object categories. In this work, we propose a data-efficient 3D scanning framework that uses Diffusion Policy to imitate human-like scanning strategies. To enhance robustness and generalization, we adopt Occupancy Grid Mapping instead of direct point cloud processing, which improves noise resilience and the handling of diverse object geometries. We also introduce a hybrid approach that combines a sphere-based space representation with a path optimization procedure to ensure path safety and scanning efficiency. This approach addresses limitations of conventional imitation learning, such as redundant or unpredictable behavior. We evaluate our method on unseen objects that vary in both shape and scale. Our method achieves higher coverage and shorter paths than the baselines while remaining robust to sensor noise. We further confirm practical feasibility and stable operation in real-world execution.
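
To illustrate why a grid representation can be less sensitive to sensor noise than raw points, here is a minimal sketch in plain Python. It is not the paper's mapping procedure: a full Occupancy Grid Mapping pipeline typically updates per-cell occupancy probabilities along sensor rays, whereas this stand-in only counts point hits per voxel. The function names, `voxel_size`, and `min_hits` are assumptions made for the example.

```python
import math
from collections import Counter
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxel_index(p: Point, voxel_size: float) -> Voxel:
    """Map a 3D point to the integer index of the voxel that contains it."""
    return (
        math.floor(p[0] / voxel_size),
        math.floor(p[1] / voxel_size),
        math.floor(p[2] / voxel_size),
    )


def occupancy_grid(
    points: Iterable[Point], voxel_size: float = 0.1, min_hits: int = 2
) -> Set[Voxel]:
    """Return the voxels hit by at least `min_hits` points.

    Simplified hit-count rule (an assumption of this sketch): isolated
    points, such as sensor outliers, fall below the threshold and are dropped.
    """
    hits = Counter(voxel_index(p, voxel_size) for p in points)
    return {v for v, n in hits.items() if n >= min_hits}


# Three points cluster in one voxel; the fourth is an isolated outlier.
scan = [
    (0.01, 0.01, 0.01),
    (0.02, 0.03, 0.04),
    (0.05, 0.05, 0.05),
    (0.95, 0.95, 0.95),
]
grid = occupancy_grid(scan, voxel_size=0.1, min_hits=2)
```

In this toy scan the clustered points mark a single occupied voxel, and the lone outlier does not change the grid, whereas a policy consuming the raw point cloud would see it directly.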

Paper Structure (16 sections, 9 equations, 7 figures, 5 tables)

Figures (7)

  • Figure 1: Generalizable 3D Scanning Policy. Our proposed framework, ScanDP, generalizes to unseen objects with a small amount of training data (right). Prior works lack generalization capability and are not robust to changes in conditions (left).
  • Figure 2: Method Overview. ScanDP consists of two main components: Path Generation (upper part) and Path Optimization (lower part). In the Path Generation phase, actions are generated by a diffusion policy conditioned on an occupancy grid map (OGM) and the current camera pose. In the Path Optimization phase, the generated actions are refined through a bubble-based collision filter and viewpoint extraction to optimize the scanning trajectory.
  • Figure 3: Path length comparison (Scale $\times 1.0$)
  • Figure 4: Path length comparison (Scale $\times 1.5$)
  • Figure 5: Visualization of movement paths from DP3 and ScanDP. ScanDP with path optimization attains the shortest path length and smoothest movement. We find that DP3 tends to get stuck in a particular location when scanning objects not seen during training.
  • ...and 2 more figures
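
The Path Optimization phase described in the Figure 2 caption can be pictured with a small sketch. The text here does not specify how the bubble-based collision filter works, so the following assumes a hypothetical rule: free space is approximated by spheres ("bubbles"), and a generated waypoint is kept only if it lies inside at least one of them. The `Bubble` type and all function names are illustrative, not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Bubble = Tuple[Point, float]  # (center, radius) of a sphere assumed to be collision-free


def inside_any_bubble(p: Point, bubbles: Sequence[Bubble]) -> bool:
    """True if the point lies within at least one free-space sphere."""
    return any(math.dist(p, center) <= radius for center, radius in bubbles)


def filter_waypoints(waypoints: Sequence[Point], bubbles: Sequence[Bubble]) -> List[Point]:
    """Keep only the waypoints covered by the free-space bubbles (assumed rule)."""
    return [p for p in waypoints if inside_any_bubble(p, bubbles)]


def path_length(waypoints: Sequence[Point]) -> float:
    """Sum of straight-line distances between consecutive waypoints."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))


bubbles = [((0.0, 0.0, 0.0), 1.0), ((2.0, 0.0, 0.0), 1.0)]
raw_path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (2.0, 0.5, 0.0)]
safe_path = filter_waypoints(raw_path, bubbles)
```

In this example the third waypoint lies outside both spheres and is dropped, which also shortens the path. The sketch checks waypoints only; a practical filter would also need to verify the segments between them.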