One-Shot Transfer of Long-Horizon Extrinsic Manipulation Through Contact Retargeting
Albert Wu, Ruocheng Wang, Sirui Chen, Clemens Eppner, C. Karen Liu
TL;DR
The paper addresses generalizing long-horizon extrinsic manipulation by retargeting contact requirements from a single demonstration. It introduces a primitive library of four short-horizon, goal-conditioned policies and an IK-based contact retargeting framework to map demonstrations to new scenes while preserving the sequence of contact configurations. Hardware experiments across four tasks, ten objects, and six environments achieve an overall success rate of $80.5\%$ (with $81.7\%$ on standard objects), demonstrating robustness to demonstration variation and environment changes. By enabling reliable chaining of primitives through contact retargeting, the approach offers a scalable path toward real-world extrinsic manipulation with limited demonstrations.
Abstract
Extrinsic manipulation, the use of environment contacts to achieve manipulation objectives, enables strategies that are otherwise impossible with a parallel jaw gripper. However, orchestrating a long-horizon sequence of contact interactions between the robot, object, and environment is notoriously challenging due to scene diversity, the large action space, and difficult contact dynamics. We observe that most extrinsic manipulation tasks are combinations of short-horizon primitives, each of which depends strongly on initializing from a desirable contact configuration to succeed. Therefore, we propose to generalize one extrinsic manipulation trajectory to diverse objects and environments by retargeting contact requirements. We prepare a single library of robust short-horizon, goal-conditioned primitive policies, and design a framework to compose state constraints stemming from the contact specifications of each primitive. Given a test scene and a single demo prescribing the primitive sequence, our method enforces the state constraints on the test scene and finds intermediate goal states using inverse kinematics. The goals are then tracked by the primitive policies. Using a 7+1 DoF robotic arm-gripper system, we achieved an overall success rate of 80.5% on hardware over 4 long-horizon extrinsic manipulation tasks, each with up to 4 primitives. Our experiments cover 10 objects and 6 environment configurations. We further show empirically that our method admits a wide range of demonstrations, and that contact retargeting is indeed the key to successfully combining primitives for long-horizon extrinsic manipulation. Code and additional details are available at stanford-tml.github.io/extrinsic-manipulation.
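To make the idea of retargeting contact requirements concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a one-dimensional scene with a single wall, and it replaces the inverse kinematics step with a closed-form solve of each contact constraint. All names (`Scene`, `Primitive`, `against_wall`, `offset_from_wall`, `retarget`) are hypothetical and do not come from the released code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scene:
    """Hypothetical 1D test scene: a wall and an object of known width."""
    wall_x: float        # x-position of the wall face
    object_width: float  # object extent along x


# A contact constraint maps a scene to the object-center position that
# satisfies the contact. In this toy it stands in for an IK solve.
ContactConstraint = Callable[[Scene], float]


def against_wall(scene: Scene) -> float:
    # The object's far face touches the wall, so its center sits
    # half a width in front of the wall.
    return scene.wall_x - scene.object_width / 2.0


def offset_from_wall(distance: float) -> ContactConstraint:
    # The object's far face rests `distance` away from the wall.
    def constraint(scene: Scene) -> float:
        return scene.wall_x - scene.object_width / 2.0 - distance
    return constraint


@dataclass
class Primitive:
    """A short-horizon primitive with the contact state it must end in."""
    name: str
    goal_constraint: ContactConstraint


def retarget(demo: List[Primitive], scene: Scene) -> List[Tuple[str, float]]:
    """Keep the demo's primitive order; recompute each goal for the new scene."""
    return [(p.name, p.goal_constraint(scene)) for p in demo]


# One demonstration prescribes the primitive sequence (hypothetical names).
demo = [Primitive("push", against_wall), Primitive("pull", offset_from_wall(0.1))]

# The same sequence yields scene-specific intermediate goals.
goals = retarget(demo, Scene(wall_x=1.0, object_width=0.2))
```

The point of the sketch is the separation of concerns: the demonstration fixes only the order of primitives and their contact requirements, while the numeric goals are recomputed for every new scene and then handed to goal-conditioned policies.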
