Composable Part-Based Manipulation
Weiyu Liu, Jiayuan Mao, Joy Hsu, Tucker Hermans, Animesh Garg, Jiajun Wu
TL;DR
CPM addresses generalization in robotic manipulation by decomposing objects into functional parts and learning part-part correspondences as probabilistic constraints over $SE(3)$ pose trajectories. It trains a collection of conditional diffusion models, one per correspondence, and composes them at inference time to sample trajectories that satisfy all constraints. The approach is validated on pouring and safe-placing tasks, using simulated data from PartNet/ShapeNetSem and zero-shot real-world transfer, and it outperforms several baselines. The results show strong cross-category generalization and robustness to geometric variation, suggesting a scalable route to general-purpose, composable manipulation skills.
Abstract
In this paper, we propose composable part-based manipulation (CPM), a novel approach that leverages object-part decomposition and part-part correspondences to improve learning and generalization of robotic manipulation skills. By considering the functional correspondences between object parts, we conceptualize functional actions, such as pouring and constrained placing, as combinations of different correspondence constraints. CPM comprises a collection of composable diffusion models, where each model captures a different inter-object correspondence. These diffusion models can generate parameters for manipulation skills based on the specific object parts. Leveraging part-based correspondences coupled with the task decomposition into distinct constraints enables strong generalization to novel objects and object categories. We validate our approach in both simulated and real-world scenarios, demonstrating its effectiveness in achieving robust and generalized manipulation capabilities.
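To make the idea of composing per-constraint diffusion models more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes each correspondence constraint can be summarized by a score function (the gradient of a log-density) and that composition amounts to summing those scores, which corresponds to sampling from the product of the individual densities. The learned networks are replaced by analytic Gaussian scores, the $SE(3)$ trajectory is replaced by a 2D vector, and a plain Langevin sampler stands in for the reverse diffusion process. All names (`gaussian_score`, `composed_score`, `sample_composed`) and parameter values are hypothetical.

```python
import numpy as np

def gaussian_score(x, mean, std):
    """Score of an isotropic Gaussian; a stand-in for one learned constraint model."""
    return -(x - mean) / std**2

def composed_score(x, constraints):
    """Sum the per-constraint scores (assumed composition rule: product of densities)."""
    return sum(gaussian_score(x, mean, std) for mean, std in constraints)

def sample_composed(constraints, n_samples=2000, n_steps=500, step=0.01, seed=0):
    """Draw samples with unadjusted Langevin dynamics driven by the composed score."""
    rng = np.random.default_rng(seed)
    dim = len(constraints[0][0])
    x = 3.0 * rng.normal(size=(n_samples, dim))  # broad random initialization
    for _ in range(n_steps):
        noise = rng.normal(size=x.shape)
        x = x + step * composed_score(x, constraints) + np.sqrt(2.0 * step) * noise
    return x

# Two hypothetical constraints, each preferring a different 2D "pose parameter".
constraints = [
    (np.array([1.0, 0.0]), 0.5),
    (np.array([0.0, 1.0]), 0.5),
]
samples = sample_composed(constraints)
print("sample mean:", samples.mean(axis=0))
```

In this toy setting the product of the two Gaussians is itself a Gaussian centered midway between the two constraint means, so the samples concentrate where both constraints are reasonably satisfied. Adding or removing a constraint only changes the list passed to `sample_composed`, which is the property that makes this style of composition modular.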
