RobotMover: Learning to Move Large Objects From Human Demonstrations

Tianyu Li, Joanne Truong, Jimmy Yang, Alexander Clegg, Akshara Rai, Sehoon Ha, Xavier Puig

TL;DR

RobotMover addresses the challenge of moving large, heavy objects in unstructured environments by learning cross-embodiment manipulation from human demonstrations. It introduces the Interaction Chain, a low-dimensional, morphology-agnostic representation that captures key agent–object interaction dynamics and enables imitation rewards to transfer across human and robot morphologies. Policies are trained in a domain-randomized simulation and transferred zero-shot to real hardware (Spot), outperforming both learning-based baselines and teleoperation across chairs, tables, and racks, and enabling real-world tasks when combined with high-level planners. The work provides a scalable, generalizable framework for large object manipulation, with practical implications for household and industrial robotics and avenues for future enhancements such as autonomous grasping and unified multi-object policies.

Abstract

Moving large objects, such as furniture or appliances, is a critical capability for robots operating in human environments. This task presents unique challenges, including whole-body coordination to avoid collisions and managing the dynamics of bulky, heavy objects. In this work, we present RobotMover, a learning-based system for large object manipulation that uses human-object interaction demonstrations to train robot control policies. RobotMover formulates the manipulation problem as imitation learning using a simplified spatial representation called the Interaction Chain, which captures essential interaction dynamics in a way that generalizes across different robot bodies. We incorporate this Interaction Chain into a reward function and train policies in simulation using domain randomization to enable zero-shot transfer to real-world robots. The resulting policies allow a Spot robot to manipulate various large objects, including chairs, tables, and standing lamps. Through extensive experiments in both simulation and the real world, we show that RobotMover achieves strong performance in terms of capability, robustness, and controllability, outperforming both learned and teleoperation baselines. The system also supports practical applications by combining learned policies with simple planning modules to perform long-horizon object transport and rearrangement tasks.
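
As a rough illustration of how a chain-based imitation reward might be computed, the sketch below compares a robot's chain against a demonstrated one. This summary does not give the paper's actual reward, so everything here is an assumption for illustration: the chain is taken to be an ordered list of 3D points (for example agent base, end effector, object center), the error is measured on the displacement vectors between consecutive nodes, and the names `link_vectors`, `imitation_reward`, and `scale` are hypothetical.

```python
import math

# Hypothetical chain layout: ordered 3D points such as
# agent base -> end effector -> object center.
# The paper's real node choice and reward form are NOT given in this summary.


def link_vectors(chain):
    """Return displacement vectors between consecutive chain nodes."""
    return [
        tuple(b_i - a_i for a_i, b_i in zip(a, b))
        for a, b in zip(chain, chain[1:])
    ]


def imitation_reward(robot_chain, demo_chain, scale=5.0):
    """Assumed reward: exp(-scale * mean squared link error), in (0, 1]."""
    robot_links = link_vectors(robot_chain)
    demo_links = link_vectors(demo_chain)
    if not demo_links or len(robot_links) != len(demo_links):
        raise ValueError("chains need the same number of nodes (at least two)")
    sq_err = sum(
        sum((r - d) ** 2 for r, d in zip(r_link, d_link))
        for r_link, d_link in zip(robot_links, demo_links)
    ) / len(demo_links)
    return math.exp(-scale * sq_err)


# Made-up example data: one demonstration frame and two robot frames.
demo = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.8), (1.0, 0.0, 0.5)]
robot_good = [(2.0, 1.0, 0.0), (2.5, 1.0, 0.8), (3.0, 1.0, 0.5)]  # same shape, shifted
robot_bad = [(2.0, 1.0, 0.0), (2.2, 1.0, 0.3), (3.0, 1.0, 0.5)]   # different shape

r_good = imitation_reward(robot_good, demo)
r_bad = imitation_reward(robot_bad, demo)
print(r_good, r_bad)
```

Because this sketch scores relative link vectors rather than absolute positions, a chain that is merely translated earns the full reward, while a chain with a different shape is penalized. That is one simple way to make a reward depend on the interaction geometry rather than on where the agent stands; the paper's own formulation may differ.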

Paper Structure

This paper contains 31 sections, 6 equations, 12 figures, 1 algorithm.

Figures (12)

  • Figure 1: RobotMover enables robots to move a variety of large objects.
  • Figure 2: Challenges of moving large objects include, but are not limited to, the object colliding with the robot and the object falling out of the robot's gripper due to high momentum.
  • Figure 3: Method Overview. RobotMover enables robots to learn to move large objects by imitating human-object interaction demonstrations. The framework leverages a novel representation, the Interaction Chain, which captures the interaction dynamics between the agent and object while remaining agnostic to the agent’s embodiment. This representation is used to design an imitation reward that guides the robot’s imitation learning process.
  • Figure 4: Interaction Graph (top) vs. Interaction Chain (bottom). The Interaction Chain offers a simpler representation to describe agent-object interactions, which can better transfer to other morphologies. This allows us to use a human demonstration to guide the reward of a robot policy.
  • Figure 5: Different human-object interaction strategies lead to different Interaction Chains.
  • ...and 7 more figures