Harmonic Mobile Manipulation
Ruihan Yang, Yejin Kim, Rose Hendrix, Aniruddha Kembhavi, Xiaolong Wang, Kiana Ehsani
TL;DR
Harmonic Mobile Manipulation (HarmonicMM) addresses the challenge of coordinating navigation and manipulation for complex daily tasks in household environments. It proposes an end-to-end reinforcement learning approach that jointly optimizes base movement and arm actions using RGB vision from two views plus proprioception, trained in photorealistic ProcTHOR simulations and transferred to a real apartment without fine-tuning. The paper introduces the Daily Mobile Manipulation Task Suite (including Opening Door, Cleaning Table, and Opening Fridge) and demonstrates that HarmonicMM outperforms two-stage baselines in simulation and achieves meaningful real-world success rates. Ablations show the importance of multi-view perception and pretrained visual encoders. The work highlights practical implications for indoor robot deployment by enabling robust, vision-based, end-to-end control. It also acknowledges limitations related to physical capabilities and proposes future extensions to more dynamic tasks and broader environments.
Abstract
Recent advancements in robotics have enabled robots to navigate complex scenes or manipulate diverse objects independently. However, robots still struggle with many household tasks that require coordinated behaviors, such as opening doors. The factorization of navigation and manipulation, while effective for some tasks, fails in scenarios requiring coordinated actions. To address this challenge, we introduce HarmonicMM, an end-to-end learning method that optimizes both navigation and manipulation, showing notable improvement over existing techniques in everyday tasks. This approach is validated in simulated and real-world environments and adapts to novel unseen settings without additional tuning. Our contributions include a new benchmark for mobile manipulation and successful deployment with only RGB visual observation in a real unseen apartment, demonstrating the potential for practical indoor robot deployment in daily life. More results are on our project site: https://rchalyang.github.io/HarmonicMM/
