RobotMover: Learning to Move Large Objects From Human Demonstrations
Tianyu Li, Joanne Truong, Jimmy Yang, Alexander Clegg, Akshara Rai, Sehoon Ha, Xavier Puig
TL;DR
RobotMover addresses the challenge of moving large, heavy objects in unstructured environments by learning cross-embodiment manipulation from human demonstrations. It introduces the Interaction Chain, a low-dimensional, morphology-agnostic representation that captures key agent–object interaction dynamics and enables imitation rewards to transfer across human and robot morphologies. Policies are trained in a domain-randomized simulation and transferred zero-shot to real hardware (Spot), outperforming both learning-based baselines and teleoperation across chairs, tables, and racks, and enabling real-world tasks when combined with high-level planners. The work provides a scalable, generalizable framework for large object manipulation, with practical implications for household and industrial robotics and avenues for future enhancements such as autonomous grasping and unified multi-object policies.
Abstract
Moving large objects, such as furniture or appliances, is a critical capability for robots operating in human environments. This task presents unique challenges, including whole-body coordination to avoid collisions and managing the dynamics of bulky, heavy objects. In this work, we present RobotMover, a learning-based system for large object manipulation that uses human-object interaction demonstrations to train robot control policies. RobotMover formulates the manipulation problem as imitation learning using a simplified spatial representation called the Interaction Chain, which captures essential interaction dynamics in a way that generalizes across different robot bodies. We incorporate this Interaction Chain into a reward function and train policies in simulation using domain randomization to enable zero-shot transfer to real-world robots. The resulting policies allow a Spot robot to manipulate various large objects, including chairs, tables, and standing racks. Through extensive experiments in both simulation and the real world, we show that RobotMover achieves strong performance in terms of capability, robustness, and controllability, outperforming both learned and teleoperation baselines. The system also supports practical applications by combining learned policies with simple planning modules to perform long-horizon object transport and rearrangement tasks.
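As a rough illustration of how a chain-style representation can feed an imitation reward, the sketch below compares the link directions of a robot chain with those of a demonstration chain. It is a minimal sketch under stated assumptions, not the formulation used in this work: the node layout (for example base, end effector, object), the helper names `chain_vectors`, `unit`, and `imitation_reward`, the direction-only comparison, and the `scale` parameter are all hypothetical.

```python
import numpy as np


def chain_vectors(nodes):
    """Return link vectors between consecutive chain nodes; nodes has shape (K, 3)."""
    nodes = np.asarray(nodes, dtype=float)
    return nodes[1:] - nodes[:-1]


def unit(vectors, eps=1e-8):
    """Normalize each link vector (eps guards against zero-length links)."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / (norms + eps)


def imitation_reward(robot_nodes, demo_nodes, scale=2.0):
    """Hypothetical tracking reward in (0, 1]; higher means closer to the demonstration.

    Only link directions are compared, so agents with different limb lengths
    can still score well against the same demonstration (an assumption made
    for this sketch).
    """
    robot_dirs = unit(chain_vectors(robot_nodes))
    demo_dirs = unit(chain_vectors(demo_nodes))
    error = np.sum((robot_dirs - demo_dirs) ** 2)
    return float(np.exp(-scale * error))


# Hypothetical chains: agent base -> hand or gripper -> object center.
demo = [[0.0, 0.0, 1.0], [0.5, 0.0, 1.0], [1.0, 0.0, 0.5]]
robot_scaled = [[0.0, 0.0, 0.5], [0.25, 0.0, 0.5], [0.5, 0.0, 0.25]]
robot_off = [[0.0, 0.0, 0.5], [0.0, 0.25, 0.5], [0.0, 0.5, 0.75]]

reward_scaled = imitation_reward(robot_scaled, demo)
reward_off = imitation_reward(robot_off, demo)
```

In this toy setup, a chain that is a scaled copy of the demonstration scores close to 1, while a chain pointing in different directions scores lower. Such a reward would be evaluated at every simulation step alongside any task or regularization terms.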
