Adaptive Mobile Manipulation for Articulated Objects In the Open World
Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak
TL;DR
This work introduces the Open-World Mobile Manipulation System, a full-stack framework for adaptive manipulation of articulated objects in open environments. It combines a structured, hierarchical action space with imitation pretraining and online RL so that the robot can adapt quickly to unseen doors, cabinets, and fridges. The approach is implemented on a low-cost mobile manipulator and validated on 20 articulated objects across 4 real buildings. Online learning raises the success rate from ~50% to ~95%, and rewards can be computed autonomously by vision-language models. The results suggest that generalist mobile manipulators that keep improving through real-world interaction are practically viable.
Abstract
Deploying robots in open-ended unstructured environments such as homes has been a long-standing research problem. However, robots are often studied only in closed-off lab settings, and prior mobile manipulation work is restricted to pick-move-place, which is arguably just the tip of the iceberg in this area. In this paper, we introduce the Open-World Mobile Manipulation System, a full-stack approach to tackle realistic articulated object operation, e.g. real-world doors, cabinets, drawers, and refrigerators in open-ended unstructured environments. The robot uses an adaptive learning framework: it first learns from a small set of data through behavior cloning (BC), and then learns from online practice on novel objects that fall outside the training distribution. We also develop a low-cost mobile manipulation hardware platform, costing around 20,000 USD, that is capable of safe and autonomous online adaptation in unstructured environments. In our experiments we use 20 articulated objects across 4 buildings on the CMU campus. With less than an hour of online learning for each object, the system increases the success rate from 50% with BC pre-training to 95% with online adaptation. Video results are available at https://open-world-mobilemanip.github.io/
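The two-stage recipe described above (behavior cloning followed by online practice) can be illustrated with a toy sketch. This is not the authors' system: the three discrete "primitives", the success probabilities, the REINFORCE-style update, and all function names are assumptions chosen for illustration only. The sketch shows how a policy cloned from demonstrations can start out preferring an action that works poorly on a novel object, and then shift toward a better one using only binary success rewards.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bc_pretrain(demo_actions, num_actions, lr=0.5, epochs=200):
    """Behavior cloning: fit logits to demonstrated actions by maximum likelihood."""
    counts = [0.0] * num_actions
    for a in demo_actions:
        counts[a] += 1.0
    freqs = [c / len(demo_actions) for c in counts]
    logits = [0.0] * num_actions
    for _ in range(epochs):
        probs = softmax(logits)
        # Gradient of the mean log-likelihood w.r.t. logits is (freqs - probs).
        logits = [l + lr * (f - p) for l, f, p in zip(logits, freqs, probs)]
    return logits


def online_adapt(logits, success_prob, rng, lr=0.1, episodes=3000):
    """Online practice: REINFORCE-style updates with a running-average baseline.

    success_prob[a] is a hypothetical per-action success rate on the novel
    object; the learner only observes a sampled 0/1 reward per episode.
    """
    logits = list(logits)
    baseline = 0.0
    for _ in range(episodes):
        probs = softmax(logits)
        a = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < success_prob[a] else 0.0
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        for i in range(len(logits)):
            indicator = 1.0 if i == a else 0.0
            logits[i] += lr * advantage * (indicator - probs[i])
    return logits


def expected_success(logits, success_prob):
    """Expected success rate of the softmax policy on the given object."""
    return sum(p * s for p, s in zip(softmax(logits), success_prob))


# Hypothetical setup: demonstrations mostly use primitive 0, but on the
# novel object primitive 1 succeeds far more often.
demos = [0] * 8 + [1] * 2
novel_object = [0.5, 0.95, 0.1]

bc_logits = bc_pretrain(demos, num_actions=3)
adapted_logits = online_adapt(bc_logits, novel_object, random.Random(0))

print("success after BC only:   ", round(expected_success(bc_logits, novel_object), 3))
print("success after adaptation:", round(expected_success(adapted_logits, novel_object), 3))
```

The sketch keeps the policy deliberately small (one logit per primitive) so that both stages fit in a few lines. A real system would condition on observations and act in a far richer, parameterized action space.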
