MATCH POLICY: A Simple Pipeline from Point Cloud Registration to Manipulation Policies

Haojie Huang, Haotian Liu, Dian Wang, Robin Walters, Robert Platt

TL;DR

MATCH POLICY reframes robotic manipulation as a point-cloud registration problem, transferring policy inference from action prediction to registration of pick/place targets against demonstrations. It stores combined point clouds derived from demonstrations and uses optimization-based registration (RANSAC plus colored ICP) to infer multi-step, open-loop actions via relative transforms, achieving high-precision performance with minimal demonstrations. The method leverages equivariant and bi-equivariant properties to enhance sample efficiency and generalization across unseen configurations, cameras, and tasks, with strong results on RLBench benchmarks and successful real-robot deployments. Practically, this approach offers a training-free, plug-and-play tool for industrial-style pick-and-place, capable of handling long horizons and articulated objects, while highlighting limitations related to segmentation requirements and open-loop execution.

Abstract

Many manipulation tasks require the robot to rearrange objects relative to one another. Such tasks can be described as a sequence of relative poses between parts of a set of rigid bodies. In this work, we propose MATCH POLICY, a simple but novel pipeline for solving high-precision pick and place tasks. Instead of predicting actions directly, our method registers the pick and place targets to the stored demonstrations. This turns action inference into a point cloud registration task and enables us to realize nontrivial manipulation policies without any training. MATCH POLICY is designed to solve high-precision tasks in a key-frame setting. By leveraging the geometric interaction and the symmetries of the task, it achieves extremely high sample efficiency and generalizability to unseen configurations. We demonstrate its state-of-the-art performance across various tasks on the RLBench benchmark compared with several strong baselines, and we test it on a real robot with six tasks.

Paper Structure

This paper contains 15 sections, 2 theorems, 1 equation, 5 figures, and 5 tables.

Key Result

Proposition 1

$a_{\mathrm{pick}}$ and $a_{\mathrm{place}}$ are invariant to transformation $g\in \mathrm{SE}(3)$ acting on $P_{ab}$.
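The invariance can be checked numerically with a minimal sketch. It assumes, as an illustration rather than the paper's stated formula, that the action is the relative transform $\hat{T}_b^{-1}\hat{T}_a$ built from the two registration poses, and that transforming $P_{ab}$ by $g$ changes both registration poses to $g\hat{T}_a$ and $g\hat{T}_b$. Under those assumptions $g$ cancels: $(g\hat{T}_b)^{-1}(g\hat{T}_a) = \hat{T}_b^{-1}\hat{T}_a$. The helper names (`matmul`, `invert_rigid`, `pose_z`, `pose_x`, `relative_action`) and all numeric poses are hypothetical.

```python
import math

def matmul(A, B):
    """Multiply two 4x4 homogeneous transforms stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(T):
    """Invert a rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def pose_z(theta, translation):
    """Rotation by theta about the z-axis followed by a translation."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = translation
    return [[c, -s, 0.0, x], [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def pose_x(theta, translation):
    """Rotation by theta about the x-axis followed by a translation."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = translation
    return [[1.0, 0.0, 0.0, x], [0.0, c, -s, y],
            [0.0, s, c, z], [0.0, 0.0, 0.0, 1.0]]

def relative_action(T_a, T_b):
    """Assumed action: relative transform T_b^{-1} T_a (illustrative only)."""
    return matmul(invert_rigid(T_b), T_a)

def max_abs_diff(A, B):
    """Largest entrywise absolute difference between two 4x4 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

# Hypothetical registration poses for the gripper (a) and the object (b).
T_a = pose_z(0.3, (0.1, 0.2, 0.5))
T_b = matmul(pose_x(-0.7, (0.0, 0.4, 0.0)), pose_z(1.1, (0.3, 0.0, 0.1)))

# An arbitrary rigid motion g applied to the stored demonstration cloud.
g = matmul(pose_x(0.9, (1.0, -2.0, 0.5)), pose_z(-1.4, (0.0, 0.0, 0.0)))

action = relative_action(T_a, T_b)
action_after_g = relative_action(matmul(g, T_a), matmul(g, T_b))

# Zero up to floating-point round-off when the action is invariant to g.
invariance_error = max_abs_diff(action, action_after_g)
```

In this sketch `invariance_error` stays at round-off level for any choice of `g`, since the inverse of `g` from the object pose cancels `g` in the gripper pose.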

Figures (5)

  • Figure 1: Pipeline of $\textsc{Match Policy}$. (a) To generate the pick action, we register the gripper ($\hat{P}_a$) and the phone ($\hat{P}_b$) to the demonstrated combined point clouds ($P_{ab}$). The two registration poses $(\hat{T}_a, \hat{T}_b)$ are used to compute the action that transforms the gripper to the desired pick configuration. (b) The place action is predicted with a similar pipeline.
  • Figure 2: 3D pick-place tasks from RLBench (James et al., 2020). The top row shows the initial scene and the bottom row illustrates the completed state. The tasks are: Phone-on-Base, Stack-Wine, Put-Plate, Slide-Roll, Plug-Charger, and Insert-Knife.
  • Figure 3: Articulated Object Manipulation: Open Microwave. (a) Grasp the handle of the microwave; (b) open the door of the microwave; (c) segmentation of the door with handle (black) and the microwave frame (red).
  • Figure 4: Long Horizon Task: Put Item in Drawer. From left to right: (a) grasp the handle of the drawer; (b) open the drawer; (c) pick up the red block; (d) put the block in the drawer.
  • Figure 5: Real-robot tasks. The top row shows the observed real-sensor point clouds, the second row shows the first pick action, and the last row shows the completed state. Tasks from left to right: (a) putting banana; (b) hanging mug; (c) inserting flower; (d) pouring ball; (e) packing shoes; (f) arranging letters.

Theorems & Definitions (5)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • proof