Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation

Mayank Mittal, David Hoeller, Farbod Farshidian, Marco Hutter, Animesh Garg

TL;DR

This work addresses autonomous articulated-object interaction in unknown, dynamic environments using whole-body mobile manipulation. It introduces a two-stage framework with an object-centric planner (scene interpretation via ANCSH and keyframe generation) and an agent-centric MPC-based planner (with ESDF-based collision avoidance) to safely execute plans on wheel-based and legged robots. Hardware and simulation experiments demonstrate that the MPC-based planner yields substantially higher success rates and faster task completion than IK-based baselines, validating the joint perception-control approach for kitchen-scale articulated objects. By integrating perception-driven object models with principled real-time control, the framework enables robust, efficient autonomous manipulation in unstructured spaces.

Abstract

A kitchen assistant needs to operate human-scale objects, such as cabinets and ovens, in unmapped environments with dynamic obstacles. Autonomous interaction in such environments requires integrating dexterous manipulation and fluid mobility. While mobile manipulators in different form factors provide an extended workspace, their real-world adoption has been limited. Executing a high-level task for general objects requires a perceptual understanding of the object as well as adaptive whole-body control among dynamic obstacles. In this paper, we propose a two-stage architecture for autonomous interaction with large articulated objects in unknown environments. The first stage, the object-centric planner, focuses only on the object and uses RGB-D data to provide an action-conditional sequence of states for manipulation. The second stage, the agent-centric planner, formulates whole-body motion control as an optimal control problem that ensures safe tracking of the generated plan, even in scenes with moving obstacles. We show that the proposed pipeline can handle complex static and dynamic kitchen settings for both wheel-based and legged mobile manipulators. Compared to other agent-centric planners, our proposed planner achieves a higher success rate and a lower execution time. We also perform hardware tests in which a legged mobile manipulator interacts with various articulated objects in a kitchen. For additional material, see: www.pair.toronto.edu/articulated-mm/.

Paper Structure

This paper contains 23 sections, 4 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: We present a framework for articulated object interaction that can handle unknown spaces, variations within an object category, and dynamic scenes. We consider both wheel-based and legged mobile manipulation systems.
  • Figure 2: The two-level hierarchy in the proposed framework. The object-centric planner comprises a scene interpreter and a keyframe generator. It uses perceptual information to generate task-space plans. The agent-centric planner follows the computed plan while satisfying constraints and performing online collision avoidance.
  • Figure 3: We desire an agent-centric planner ($K$) that maximizes the feasibility of object-centric plans ($O$) for a given robot $A$. Left: When the robot is physically incapable of manipulating the object. Middle: When the robot is capable, but the agent-centric planner limits viable motions. Right: Ideal scenario where the agent-centric planner maximizes the feasible set as much as possible.
  • Figure 4: Different kitchen scenes designed in NVIDIA IsaacSim [nvidia2020omniverse]. The kitchens differ in architecture, free space for mobility, and articulated object instances.
  • Figure 5: Mobile manipulation platforms considered in this work: (a) Mabi-Mobile with a wheeled base, (b) ALMA with a legged base. For collision checking, we approximate the robot with collision spheres.
  • ...and 2 more figures
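
The Figure 5 caption mentions approximating the robot with collision spheres, and the TL;DR mentions ESDF-based collision avoidance. The following is a minimal sketch of how those two ideas can combine, not the authors' implementation: each sphere's clearance is the signed distance at its center minus its radius. All names here (`Sphere`, `box_signed_distance`, `sphere_clearances`, `in_collision`) are hypothetical, and an analytic signed distance to a single axis-aligned box stands in for a lookup into a real ESDF built from sensor data.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class Sphere:
    """One collision sphere of the robot approximation (hypothetical type)."""
    center: Point
    radius: float


def box_signed_distance(point: Point, box_min: Point, box_max: Point) -> float:
    """Signed distance to an axis-aligned box: positive outside, negative inside.

    Assumption: this analytic function is a stand-in for querying an ESDF.
    """
    q = []
    for p, lo, hi in zip(point, box_min, box_max):
        half_extent = (hi - lo) / 2.0
        box_center = (hi + lo) / 2.0
        q.append(abs(p - box_center) - half_extent)
    outside = math.sqrt(sum(max(v, 0.0) ** 2 for v in q))
    inside = min(max(q), 0.0)
    return outside + inside


def sphere_clearances(
    spheres: Sequence[Sphere], distance_fn: Callable[[Point], float]
) -> List[float]:
    """Clearance per sphere: distance at the sphere center minus its radius."""
    return [distance_fn(s.center) - s.radius for s in spheres]


def in_collision(
    spheres: Sequence[Sphere],
    distance_fn: Callable[[Point], float],
    margin: float = 0.0,
) -> bool:
    """True if any sphere's clearance falls below the safety margin."""
    return any(c < margin for c in sphere_clearances(spheres, distance_fn))


if __name__ == "__main__":
    # Hypothetical scene: one box obstacle and a two-sphere robot.
    def obstacle(p: Point) -> float:
        return box_signed_distance(p, (1.0, -0.5, 0.0), (2.0, 0.5, 1.0))

    robot = [Sphere((0.0, 0.0, 0.3), 0.4), Sphere((0.6, 0.0, 0.8), 0.15)]
    print(sphere_clearances(robot, obstacle))
    print(in_collision(robot, obstacle, margin=0.3))
```

A planner of the kind described in the abstract could keep such clearances above a margin along the predicted trajectory. How the paper's optimal control problem encodes this constraint is not stated in this section, so the margin-based check above is only an illustration.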