3D-CovDiffusion: 3D-Aware Diffusion Policy for Coverage Path Planning

Chenyuan Chen, Haoran Ding, Ran Ding, Tianyu Liu, Zewen He, Anqing Duan, Dezhen Song, Xiaodan Liang, Yoshihiko Nakamura

TL;DR

The paper tackles the challenge of generating long, smooth 3D spray trajectories that robustly cover complex surfaces across diverse geometries. It introduces 3DCovDiffusion, an end-to-end diffusion policy conditioned on both 3D point clouds and prior motion, capable of producing spatially coherent 6-DoF trajectories in a single pass without segment-wise stitching. The approach leverages a geometry-and-motion history encoder, a conditional DDIM framework with FiLM-based conditioning, and an object-aligned coverage objective to maximize surface coverage while ensuring smoothness. Empirical results on extended PaintNet datasets show substantial improvements in Point-wise Chamfer Distance, Coverage, and Smoothness over strong baselines, with notable gains in generalization across object categories; limitations arise from data scarcity in some categories, particularly containers. Overall, the method offers a scalable, category-agnostic pathway toward end-to-end trajectory learning for industrial surface-processing tasks with practical implications for production-line deployment.
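The FiLM-based conditioning mentioned above can be illustrated with a minimal sketch. It assumes the usual FiLM form, in which a condition vector is mapped to a per-channel scale (gamma) and shift (beta) that modulate intermediate features. The function names, the linear maps, and the shapes are hypothetical and are not taken from the paper's implementation.

```python
def linear(weights, bias, vec):
    """Apply y = W @ vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each channel."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma and beta must have equal length")
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


def film_from_condition(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Derive gamma and beta from the condition vector, then modulate.

    Assumption: gamma and beta are plain linear functions of `cond`.
    A real network would typically learn these maps jointly with the denoiser.
    """
    gamma = linear(w_gamma, b_gamma, cond)
    beta = linear(w_beta, b_beta, cond)
    return film(features, gamma, beta)


# Toy usage: two feature channels modulated by a 2-D condition vector.
modulated = film_from_condition(
    features=[1.0, 2.0],
    cond=[1.0, 0.0],
    w_gamma=[[2.0, 0.0], [0.0, 1.0]], b_gamma=[0.0, 0.0],
    w_beta=[[1.0, 1.0], [0.0, 0.0]], b_beta=[0.0, 0.5],
)
```

In a diffusion policy, a modulation of this kind would typically be applied inside each block of the denoising network, so that the same geometry and motion-history feature steers every denoising step.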

Abstract

Diffusion models, a class of deep generative models, have recently emerged as powerful tools for learning robot skills, offering stable training with reliable convergence. In this paper, we present an end-to-end framework for generating long, smooth trajectories that explicitly target high surface coverage across various industrial tasks, including polishing, robotic painting, and spray coating. Conventional methods are fundamentally constrained by their predefined functional forms, which limit the trajectory shapes they can represent and make it difficult to handle complex and diverse tasks. They also generalize poorly, often requiring manual redesign or extensive parameter tuning when applied to new scenarios. These limitations highlight the need for more expressive generative models, making diffusion-based approaches a compelling choice for trajectory generation. By iteratively denoising trajectories with carefully learned noise schedules and conditioning mechanisms, diffusion models not only ensure smooth and consistent motion but also flexibly adapt to the task context. In experiments, our method improves trajectory continuity, maintains high coverage, and generalizes to unseen shapes, paving the way for unified end-to-end trajectory learning across industrial surface-processing tasks without category-specific models. On average, our approach improves Point-wise Chamfer Distance by 98.2% and smoothness by 97.0%, while increasing surface coverage by 61% compared to prior methods. The code is available at https://anonymous.4open.science/r/spraydiffusion_ral-2FCE/README.md.
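To make the reported metrics more concrete, the sketch below computes a symmetric Chamfer distance between two point sets and a simple coverage ratio. Both are assumptions for illustration: the Chamfer variant here averages unsquared Euclidean nearest-neighbor distances in both directions, and the coverage ratio counts surface points lying within a hypothetical spray radius of any trajectory point. The paper's exact metric definitions may differ.

```python
import math


def nearest_distance(point, points):
    """Euclidean distance from `point` to its nearest neighbor in `points`."""
    return min(math.dist(point, q) for q in points)


def chamfer_distance(set_a, set_b):
    """Symmetric Chamfer distance (mean nearest-neighbor distance, both ways)."""
    if not set_a or not set_b:
        raise ValueError("both point sets must be non-empty")
    a_to_b = sum(nearest_distance(p, set_b) for p in set_a) / len(set_a)
    b_to_a = sum(nearest_distance(q, set_a) for q in set_b) / len(set_b)
    return a_to_b + b_to_a


def coverage_ratio(surface_points, trajectory_points, radius):
    """Fraction of surface points within `radius` of any trajectory point.

    Assumption: `radius` is a hypothetical stand-in for the tool footprint.
    """
    if not surface_points or not trajectory_points:
        raise ValueError("both point sets must be non-empty")
    covered = sum(
        1 for s in surface_points
        if nearest_distance(s, trajectory_points) <= radius
    )
    return covered / len(surface_points)


# Toy usage with hand-picked 3D points.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(predicted, reference)

surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
cov = coverage_ratio(surface, [(0.0, 0.0, 0.1)], radius=0.5)
```

This brute-force version is quadratic in the number of points; for dense point clouds a spatial index such as a k-d tree would be the usual choice.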

Paper Structure

This paper contains 27 sections, 15 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of the 3DCovDiffusion framework. The policy learns from object geometry and demonstration trajectories, using point clouds as conditions and demonstrations as supervision. At inference, it generates ordered, single-pass trajectories conditioned only on the point cloud and prior predictions. As shown on the right, 3DCovDiffusion (green box) achieves higher surface coverage than the baseline method (black box) by producing complete, temporally ordered trajectories in a single pass, whereas the baseline outputs unordered points that require post-hoc sorting. As a single end-to-end model, 3DCovDiffusion exhibits stronger generalization across diverse object geometries.
  • Figure 2: Illustration of the 3DCovDiffusion architecture. First, input point clouds are passed through the geometry encoder, which extracts a global observation feature. Simultaneously, the robot state is encoded to produce a state feature. These two features are combined to form the global condition for trajectory generation. Next, a diffusion model samples a noisy trajectory sequence from a Gaussian prior and iteratively denoises it into a noise-free trajectory conditioned on the global features. Finally, the noise-free segments are concatenated to form a complete trajectory.
  • Figure 3: Qualitative comparison of trajectory predictions for four object categories. Gray: point cloud; red: ground truth (GT); blue: 3DCovDiffusion (ours).
  • Figure 4: Qualitative coverage comparison across object categories. Columns (left to right) show PaintNet and 3DCovDiffusion (ours); rows correspond to Windows, Cuboids, Shelves, and Containers. Each cell presents multiple representative viewpoints with a surface coverage visualization: yellow regions indicate covered (painted) surfaces, while gray regions indicate uncovered surfaces. This facilitates visual comparison of coverage completeness and consistency across methods.
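
The denoising stage described in the Figure 2 caption (sample from a Gaussian prior, then iteratively denoise under a global condition) can be sketched with a deterministic DDIM sampler. This is a minimal illustration under stated assumptions: the linear beta schedule, the step counts, and the `eps_model(x, t, cond)` interface are hypothetical, and the learned noise predictor is replaced by a toy oracle that knows the target trajectory, so the loop can be checked end to end.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def ddim_sample(eps_model, cond, dim, alpha_bars, timesteps, rng):
    """Deterministic DDIM sampling (eta = 0) over descending `timesteps`."""
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # Gaussian prior sample
    for i, t in enumerate(timesteps):
        ab_t = alpha_bars[t]
        # After the last listed step, jump to the noise-free level.
        ab_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = eps_model(x, t, cond)
        # Predicted clean sample implied by the current noise estimate.
        x0 = [(xi - math.sqrt(1.0 - ab_t) * ei) / math.sqrt(ab_t)
              for xi, ei in zip(x, eps)]
        # Re-noise the prediction to the previous (less noisy) level.
        x = [math.sqrt(ab_prev) * x0i + math.sqrt(1.0 - ab_prev) * ei
             for x0i, ei in zip(x0, eps)]
    return x


def make_oracle(alpha_bars):
    """Toy stand-in for the learned network: `cond` is the target itself."""
    def eps_model(x, t, cond):
        ab = alpha_bars[t]
        return [(xi - math.sqrt(ab) * ci) / math.sqrt(1.0 - ab)
                for xi, ci in zip(x, cond)]
    return eps_model


# Toy usage: recover a flattened 4-value "trajectory" in 6 DDIM steps.
alpha_bars = make_alpha_bars(100)
target = [0.1, -0.2, 0.3, 0.4]
sample = ddim_sample(
    make_oracle(alpha_bars), cond=target, dim=len(target),
    alpha_bars=alpha_bars, timesteps=[99, 79, 59, 39, 19, 0],
    rng=random.Random(0),
)
```

In an actual policy, the oracle would be replaced by a trained network that receives the combined geometry and state feature as `cond`, and the output vector would be reshaped into a sequence of 6-DoF poses.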