Deformable Cluster Manipulation via Whole-Arm Policy Learning
Jayadeep Jacob, Wenzheng Zhang, Houston Warren, Paulo Borges, Tirthankar Bandyopadhyay, Fabio Ramos
TL;DR
This work tackles deformable cluster manipulation under occlusion by learning a model-free, multi-modal policy that operates with full-arm contact. It fuses segmented 3D point clouds with proprioceptive touch through a distributional state representation in a reproducing kernel Hilbert space (RKHS) using kernel mean embeddings, and it uses a context-agnostic occlusion reward to drive de-occlusion. Trained in massively parallel Isaac Gym simulations with domain randomization, the policy transfers zero-shot to real Kinova hardware equipped with a single RGB-D camera, aided by a robust real-world vision pipeline. Across extensive ablations and real-world tests, the approach outperforms hand-crafted inverse-kinematics (IK) baselines and graph-based baselines, highlighting the value of global distributional features and proprioceptive contact cues for learned, whole-arm manipulation of deformable clusters. Limitations remain in perception quality and the sim-to-real gap, with future work aimed at enhanced perception, system identification, tactile sensing, and broader deployment to other deformable clearance tasks.
Abstract
Manipulating clusters of deformable objects is a substantial challenge with widespread applicability, and it requires contact-rich whole-arm interactions. A potential solution must address, among other issues, the limited capacity for realistic model synthesis, high uncertainty in perception, and the lack of efficient spatial abstractions. We propose a novel framework for learning model-free policies that integrates two modalities, 3D point clouds and proprioceptive touch indicators, and emphasises manipulation with full-body contact awareness, going beyond traditional end-effector modes. Our reinforcement learning framework leverages a distributional state representation, aided by kernel mean embeddings, to achieve improved training efficiency and real-time inference. Furthermore, we propose a novel context-agnostic occlusion heuristic to clear deformables from a target region for exposure tasks. We deploy the framework in a power line clearance scenario and observe that the agent generates creative strategies leveraging multiple arm links for de-occlusion. Finally, we perform zero-shot sim-to-real policy transfer, allowing the arm to clear real branches with unknown occlusion patterns, unseen topology, and uncertain dynamics. Website: https://sites.google.com/view/dcmwap/
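
To illustrate the idea of a distributional state representation built from kernel mean embeddings, the following is a minimal sketch and not the authors' implementation. It assumes an RBF kernel approximated with random Fourier features; the function names (make_rff, feature_map, mean_embedding), the lengthscale, and the feature count are hypothetical choices made only for this example. The sketch shows how a point cloud of any size can be summarised as one fixed-length vector by averaging per-point features, which is the empirical mean embedding under the approximated kernel.

```python
import math
import random


def make_rff(dim, n_features, lengthscale, seed=0):
    """Draw random Fourier feature parameters (assumed RBF kernel)."""
    rng = random.Random(seed)
    # Frequencies sampled from N(0, 1/lengthscale^2) correspond to an RBF kernel.
    omegas = [
        [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
        for _ in range(n_features)
    ]
    # Phases sampled uniformly from [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return omegas, phases


def feature_map(point, omegas, phases):
    """Map one point to its random Fourier feature vector."""
    scale = math.sqrt(2.0 / len(omegas))
    return [
        scale * math.cos(sum(w * x for w, x in zip(omega, point)) + b)
        for omega, b in zip(omegas, phases)
    ]


def mean_embedding(points, omegas, phases):
    """Average the per-point features: an empirical kernel mean embedding."""
    acc = [0.0] * len(omegas)
    for p in points:
        for j, v in enumerate(feature_map(p, omegas, phases)):
            acc[j] += v
    return [a / len(points) for a in acc]


if __name__ == "__main__":
    # Hypothetical toy point cloud; a real one would come from segmented depth data.
    omegas, phases = make_rff(dim=3, n_features=16, lengthscale=0.5, seed=1)
    cloud = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.0), (0.2, 0.2, 0.9)]
    embedding = mean_embedding(cloud, omegas, phases)
    print(len(embedding))
```

Because the embedding is an average, its length depends only on the number of features, not on the number of points, and reordering the points leaves it unchanged up to floating-point rounding. These two properties are what make such a vector convenient as a policy input for point clouds of varying size.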
