AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning

Yuanfei Wang, Xiaojie Zhang, Ruihai Wu, Yu Li, Yan Shen, Mingdong Wu, Zhaofeng He, Yizhou Wang, Hao Dong

TL;DR

Adaptive manipulation of articulated objects with hidden internal states requires policies that react to failure and state feedback. AdaManip provides a diverse simulation environment with 9 object categories and 5 adaptive mechanisms, paired with an adaptive demonstration collection and a 3D diffusion-based imitation learning method. The diffusion-based approach models multi-modal action distributions conditioned on history and partial observations, enabling adaptive recovery strategies. Real-world experiments with a Panda robot demonstrate practical generalization and recovery behavior, underscoring the framework's potential for real-world household manipulation tasks.

Abstract

Articulated object manipulation is a critical capability for robots to perform various tasks in real-world scenarios. Composed of multiple parts connected by joints, articulated objects are endowed with diverse functional mechanisms through complex relative motions. For example, a safe consists of a door, a handle, and a lock, where the door can only be opened when the latch is unlocked. The internal structure, such as the state of a lock or joint angle constraints, cannot be directly observed from visual observation. Consequently, successful manipulation of these objects requires adaptive adjustment based on trial and error rather than a one-time visual inference. However, previous datasets and simulation environments for articulated objects have primarily focused on simple manipulation mechanisms where the complete manipulation process can be inferred from the object's appearance. To enhance the diversity and complexity of adaptive manipulation mechanisms, we build a novel articulated object manipulation environment and equip it with 9 categories of objects. Based on the environment and objects, we further propose an adaptive demonstration collection and 3D visual diffusion-based imitation learning pipeline that learns the adaptive manipulation policy. The effectiveness of our designs and proposed method is validated through both simulation and real-world experiments. Our project page is available at: https://adamanip.github.io

Paper Structure

This paper contains 23 sections, 2 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Example comparison between Static and Adaptive Policies. The safe can be opened directly if it is unlocked; otherwise, the key must be turned to unlock the latch before the door can be opened. However, the lock state cannot be determined from visual observations alone. Static Policy: The demonstrations for training the static policy are optimal trajectories under full observation, covering both locked and unlocked states. Consequently, the learned policy is a bimodal distribution conditioned on visual observation alone. If the robot samples the "unlocked trajectory" and fails to open the locked door, it ends up in an out-of-distribution state. Adaptive Policy: The demonstrations for training the adaptive policy include recovery from a failed door opening. Therefore, the policy learns to first pull the door to check the lock state and then updates its action distribution based on the feedback.
  • Figure 2: Adaptive manipulation dataset and environments. The Bottle and Pressure Cooker feature the Rotate & Slide mechanism, requiring continued rotation after a failed lift. The Window includes the Lock and Random Rotation Direction mechanisms, necessitating exploration of the correct rotation direction to unlock the latch. The Microwave incorporates the Lock and Switch Contact mechanisms, where the robot must first pull the handle to check the lock state and press the button if the door is locked.
  • Figure 3: Adaptive demonstration collection: Given the uncertain lock state of a microwave, we instruct the robot to first pull the door to check if it is locked, and then follow one of two different trajectories based on the result. Diffusion-based 3D adaptive manipulation policy: Conditioning on the history of 3D visual features, proprioceptions, and actions, the policy denoises Gaussian noise into the trajectory distribution. Initially, the policy captures the bimodal distribution in the demonstrations based on the initial observation. Once the lock state is determined from feedback, the policy distribution adaptively shifts to a unimodal distribution.
  • Figure 4: Adaptive Environments and Qualitative Manipulation Results. This figure shows the manipulation results for the object categories not covered in Figure 2.
  • Figure 5: Manipulation Trajectories Proposed by Our Method and Others. Our method can sequentially propose stable and accurate adaptive actions, while others have their respective drawbacks.
  • ...and 3 more figures
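
The contrast drawn in the Figure 1 caption can be illustrated with a toy sketch. This is not the paper's environment or its diffusion policy: `ToySafe`, `run_static`, and `run_adaptive` are hypothetical names introduced only for illustration, and the hand-written rule in `run_adaptive` stands in for what a learned history-conditioned policy might do. The static policy commits to one of two full trajectories from the initial observation alone, while the adaptive policy probes the door first and reacts to the outcome.

```python
import random


class ToySafe:
    """Hypothetical stand-in for a safe whose lock state is hidden from vision."""

    def __init__(self, locked: bool):
        self._locked = locked  # hidden internal state, never shown to the policy
        self.opened = False

    def step(self, action: str) -> bool:
        """Apply one action and return whether the door is open afterwards."""
        if action == "turn_key":
            self._locked = False
        elif action == "pull_door" and not self._locked:
            self.opened = True
        return self.opened


def run_static(env: ToySafe, rng: random.Random) -> bool:
    """Static policy: sample one whole trajectory up front (bimodal choice)."""
    trajectory = rng.choice([["pull_door"], ["turn_key", "pull_door"]])
    for action in trajectory:
        env.step(action)
    return env.opened


def run_adaptive(env: ToySafe, max_steps: int = 4) -> bool:
    """Adaptive policy: choose each action from the action/feedback history."""
    history = []  # list of (action, door_opened) pairs
    for _ in range(max_steps):
        if history and history[-1] == ("pull_door", False):
            action = "turn_key"   # the pull failed, so the door must be locked
        else:
            action = "pull_door"  # probe the door, or retry after unlocking
        opened = env.step(action)
        history.append((action, opened))
        if opened:
            return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 200
    static_wins = sum(run_static(ToySafe(locked=True), rng) for _ in range(trials))
    adaptive_wins = sum(run_adaptive(ToySafe(locked=True)) for _ in range(trials))
    print(f"static policy on locked safes:   {static_wins}/{trials}")
    print(f"adaptive policy on locked safes: {adaptive_wins}/{trials}")
```

In this toy setting the static policy opens a locked safe only when it happens to sample the key-first trajectory, whereas the adaptive rule recovers after a failed pull. The sketch ignores the out-of-distribution behaviour the caption describes for a failed static rollout; it simply stops.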