MaskPlanner: Learning-Based Object-Centric Motion Generation from 3D Point Clouds

Gabriele Tiboni, Raffaello Camoriano, Tatiana Tommasi

TL;DR

This paper tackles Object-Centric Motion Generation (OCMG) by learning from 3D point clouds to produce long-horizon, unstructured paths without task-specific heuristics. It introduces MaskPlanner, which jointly predicts local path segments and path masks in a single forward pass; a postprocessing step then assembles them into executable paths. The Extended PaintNet dataset, comprising 3088 samples, supports the evaluation. MaskPlanner achieves near-complete surface coverage on unseen objects in simulation and expert-level painting quality in real-world validation on a 6-DoF robot. The results indicate strong generalization, real-time inference, and potential applicability to object-centric tasks beyond spray painting.

Abstract

Object-Centric Motion Generation (OCMG) plays a key role in a variety of industrial applications, such as robotic spray painting and welding, requiring efficient, scalable, and generalizable algorithms to plan multiple long-horizon trajectories over free-form 3D objects. However, existing solutions rely on specialized heuristics, expensive optimization routines, or restrictive geometry assumptions that limit their adaptability to real-world scenarios. In this work, we introduce a novel, fully data-driven framework that tackles OCMG directly from 3D point clouds, learning to generalize expert path patterns across free-form surfaces. We propose MaskPlanner, a deep learning method that predicts local path segments for a given object while simultaneously inferring "path masks" to group these segments into distinct paths. This design induces the network to capture both local geometric patterns and global task requirements in a single forward pass. Extensive experimentation on a realistic robotic spray painting scenario shows that our approach attains near-complete coverage (above 99%) for unseen objects, while it remains task-agnostic and does not explicitly optimize for paint deposition. Moreover, our real-world validation on a 6-DoF specialized painting robot demonstrates that the generated trajectories are directly executable and yield expert-level painting quality. Our findings crucially highlight the potential of the proposed learning method for OCMG to reduce engineering overhead and seamlessly adapt to several industrial use cases.
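
As a rough illustration of how path masks could group path-agnostic segments into distinct paths, the sketch below assigns each segment to the mask that scores it highest. This is not the paper's postprocessing, which is not described in this summary. The function name `group_segments_by_mask`, the `mask_scores` layout (one row of per-segment scores per candidate path), the argmax rule, and the `min_score` threshold are all assumptions made for this example.

```python
from typing import List, Sequence


def group_segments_by_mask(
    segments: Sequence[Sequence[float]],
    mask_scores: Sequence[Sequence[float]],
    min_score: float = 0.5,
) -> List[List[int]]:
    """Group segment indices into paths using per-path mask scores.

    segments:    N predicted segments (any per-segment representation).
    mask_scores: K rows of N scores in [0, 1]; mask_scores[k][i] is a
                 hypothetical confidence that segment i belongs to path k.
    Returns K lists of segment indices. A segment whose best score is
    below min_score is left unassigned (an assumption of this sketch).
    """
    if not mask_scores:
        return []
    paths: List[List[int]] = [[] for _ in mask_scores]
    for i in range(len(segments)):
        # Pick the mask with the highest score for segment i.
        best_k = max(range(len(mask_scores)), key=lambda k: mask_scores[k][i])
        if mask_scores[best_k][i] >= min_score:
            paths[best_k].append(i)
    return paths


# Toy input: four segments (here just 3D positions) and two path masks.
segments = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 1.0, 0.0], [0.1, 1.0, 0.0]]
mask_scores = [
    [0.9, 0.8, 0.1, 0.2],  # mask for path 0
    [0.1, 0.3, 0.7, 0.9],  # mask for path 1
]
paths = group_segments_by_mask(segments, mask_scores)
print(paths)
```

In this toy input the first two segments fall under the first mask and the last two under the second. Ordering the segments within each path, which an executable trajectory would need, is left out of the sketch.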

Paper Structure

This paper contains 24 sections, 7 equations, 18 figures, 4 tables.

Figures (18)

  • Figure 1: Several object-centric robotic applications may be unified under a single problem formulation, as they share common assumptions on the desired output paths, referred to as unstructured paths.
  • Figure 2: Schematic illustration of a data sample $(\mathbf{O}, \mathbf{Y})$ describing input and output of the OCMG problem. The output paths are unstructured as they vary in number and length depending on the input object and can be executed in arbitrary order.
  • Figure 3: Example of two arbitrary configurations of ground truth paths $\mathbf{Y}$ on an L-shaped 2D object. Notice how MaskPlanner can easily manage both cases by breaking down the learning problem into the prediction of path-agnostic segments and their associated path masks.
  • Figure 4: Overview of the training pipeline of our method (MaskPlanner). Global features are learned from a point cloud representation of the input object, and used to concurrently predict path segments and path masks, in a single forward pass.
  • Figure 5: Illustration of our Asymmetric Point-to-Segment curriculum for segment predictions ($\lambda{=}3$). The parameters $w^b_p,w^b_s$ weighting the backward point-wise and segment-wise ACD terms vary during training.
  • ...and 13 more figures