Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation
Mayank Mittal, David Hoeller, Farbod Farshidian, Marco Hutter, Animesh Garg
TL;DR
This work addresses autonomous interaction with articulated objects in unknown, dynamic environments using whole-body mobile manipulation. It introduces a two-stage framework: an object-centric planner (scene interpretation via ANCSH and keyframe generation) and an agent-centric MPC-based planner (with ESDF-based collision avoidance) that safely executes the resulting plan on wheel-based and legged robots. Hardware and simulation experiments show that the MPC-based planner yields substantially higher success rates and faster task completion than IK-based baselines, validating the joint perception-control approach for kitchen-scale articulated objects. Combining perception-driven object models with real-time optimal control makes autonomous manipulation in unstructured spaces more robust and efficient.
Abstract
A kitchen assistant needs to operate human-scale objects, such as cabinets and ovens, in unmapped environments with dynamic obstacles. Autonomous interaction in such environments requires integrating dexterous manipulation and fluid mobility. While mobile manipulators in different form factors provide an extended workspace, their real-world adoption has been limited. Executing a high-level task on general objects requires a perceptual understanding of the object as well as adaptive whole-body control among dynamic obstacles. In this paper, we propose a two-stage architecture for autonomous interaction with large articulated objects in unknown environments. The first stage, the object-centric planner, focuses only on the object and uses RGB-D data to provide an action-conditional sequence of states for manipulation. The second stage, the agent-centric planner, formulates whole-body motion control as an optimal control problem that ensures safe tracking of the generated plan, even in scenes with moving obstacles. We show that the proposed pipeline can handle complex static and dynamic kitchen settings for both wheel-based and legged mobile manipulators. Compared to other agent-centric planners, our proposed planner achieves a higher success rate and a lower execution time. We also perform hardware tests in which a legged mobile manipulator interacts with various articulated objects in a kitchen. For additional material, see www.pair.toronto.edu/articulated-mm/.
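To make the idea of an "action-conditional sequence of states" concrete, the following is a minimal sketch of how keyframes for a hinged object such as a cabinet door could be generated once an articulation model is available. It is not the authors' implementation: it assumes a single revolute joint with a known pivot and axis, and the function names (rotate_about_axis, revolute_keyframes) and their parameters are hypothetical. Each keyframe is the handle position after opening the door by a fraction of the target angle.

```python
import math


def rotate_about_axis(point, pivot, axis, angle):
    """Rotate a 3D point about the line through `pivot` along `axis`
    by `angle` radians, using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                 # unit joint axis
    v = [p - c for p, c in zip(point, pivot)]    # point relative to pivot
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    rotated = [
        v[i] * cos_a + k_cross_v[i] * sin_a + k[i] * k_dot_v * (1.0 - cos_a)
        for i in range(3)
    ]
    return [r + c for r, c in zip(rotated, pivot)]


def revolute_keyframes(handle, pivot, axis, target_angle, num_steps):
    """Return num_steps + 1 handle positions, from the closed pose (angle 0)
    to the pose at `target_angle`, spaced evenly in joint angle.
    Assumption: the joint is revolute and its pivot and axis are known."""
    return [
        rotate_about_axis(handle, pivot, axis, target_angle * i / num_steps)
        for i in range(num_steps + 1)
    ]


if __name__ == "__main__":
    # Hypothetical door: hinge along the z-axis at the origin, handle 1 m away.
    for frame in revolute_keyframes([1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                                    [0.0, 0.0, 1.0], math.pi / 2, 3):
        print([round(c, 3) for c in frame])
```

In a pipeline like the one described above, a sequence of this kind would be handed to the agent-centric planner as end-effector targets to track. Spacing the keyframes evenly in joint angle is an illustrative choice, not something the source specifies.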
