Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting
Lawrence Yunliang Chen, Kush Hari, Karthik Dharmarajan, Chenfeng Xu, Quan Vuong, Ken Goldberg
TL;DR
Mirage enables zero-shot policy transfer across robot embodiments by decoupling vision and control: cross-painting masks out the target robot and renders the source robot in its place, while a forward-dynamics model coupled with a blocking controller adapts source actions to the target. The approach is validated on nine manipulation tasks in simulation and on real hardware, where Mirage consistently outperforms a state-of-the-art generalist model and transfers robustly even when the robot arm or gripper changes. Key contributions include a systematic simulation study, the cross-painting transfer strategy, and extensive real-world demonstrations, which together point to a practical path for reusing policies across diverse robotic platforms. The work suggests that leveraging robots’ URDFs and aligned end-effector action spaces can substantially reduce data collection and training needs for multi-robot manipulation.
Abstract
The ability to reuse collected data and transfer trained policies between robots could alleviate the burden of additional data collection and training. While existing approaches such as pretraining plus finetuning and co-training show promise, they do not generalize to robots unseen in training. Focusing on common robot arms with similar workspaces and 2-jaw grippers, we investigate the feasibility of zero-shot transfer. Through simulation studies on 8 manipulation tasks, we find that state-based Cartesian control policies can successfully zero-shot transfer to a target robot after accounting for forward dynamics. To address robot visual disparities for vision-based policies, we introduce Mirage, which uses "cross-painting"--masking out the unseen target robot and inpainting the seen source robot--during execution in real time so that it appears to the policy as if the trained source robot were performing the task. Mirage applies to both first-person and third-person camera views and policies that take in both states and images as inputs or only images as inputs. Despite its simplicity, our extensive simulation and physical experiments provide strong evidence that Mirage can successfully zero-shot transfer between different robot arms and grippers with only minimal performance degradation on a variety of manipulation tasks such as picking, stacking, and assembly, significantly outperforming a generalist policy. Project website: https://robot-mirage.github.io/
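The compositing step behind cross-painting, as described above, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the target-robot mask, the rendering of the source robot, and the source-robot mask are already available for the current frame (for example from a renderer driven by the robots' URDFs), and it stands in a precomputed background image for a real inpainting step. The function name `cross_paint` and all of its parameters are hypothetical.

```python
import numpy as np


def cross_paint(target_img, target_mask, source_render, source_mask, fill_img):
    """Composite one cross-painted frame (illustrative sketch only).

    target_img:    (H, W, 3) uint8 camera frame showing the target robot
    target_mask:   (H, W) bool, True where the target robot is visible
    source_render: (H, W, 3) uint8 rendering of the source robot
    source_mask:   (H, W) bool, True where the rendered source robot is
    fill_img:      (H, W, 3) uint8 background estimate (assumption: a
                   stand-in for a proper inpainting step)
    """
    out = target_img.copy()
    # Step 1: mask out the unseen target robot by filling its pixels
    # with the background estimate.
    out[target_mask] = fill_img[target_mask]
    # Step 2: paint the seen source robot on top, so the policy observes
    # the embodiment it was trained on.
    out[source_mask] = source_render[source_mask]
    return out
```

Because the source robot is painted last, pixels where the two masks overlap show the source robot, and pixels covered only by the target mask fall back to the background estimate. Pixels outside both masks are left unchanged, so the objects and scene the policy needs to see pass through as captured.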
