Catch It! Learning to Catch in Flight with Mobile Dexterous Hands

Yuanhang Zhang, Tianhai Liang, Zhenyang Chen, Yanjie Ze, Huazhe Xu

TL;DR

Catch It! tackles the problem of catching thrown objects with a mobile manipulator by introducing a two-stage reinforcement learning framework that trains a unified whole-body policy for the base, arm, and dexterous hand. The method first pre-trains the base and arm on a tracking task and then fine-tunes the hand policy on the catching task, using carefully designed rewards and sim2real techniques, including low-pass filtering (LPF), system identification, and domain randomization. In simulation, the approach achieves a success rate of about 0.8 on diverse objects and randomly thrown trajectories. The policy transfers zero-shot to a real robot, where the sim2real techniques yield practical improvements, although catching performance is still affected by object elasticity and sensor limitations. This work advances long-range, agile object capture and paves the way for robust human-robot handovers using mobile, dexterous manipulation in real-world settings.
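
As a rough illustration of the LPF mentioned among the sim2real techniques, the sketch below smooths a stream of policy actions with a first-order (exponential) low-pass filter. The class name `ActionLowPassFilter`, the smoothing factor `alpha`, and the choice of a first-order filter are assumptions made for illustration; the summary does not specify the filter form or its parameters.

```python
import numpy as np


class ActionLowPassFilter:
    """First-order (exponential) low-pass filter for action vectors.

    Hypothetical helper: the filter order and `alpha` are illustrative
    assumptions, not values taken from the paper.
    """

    def __init__(self, alpha: float):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha   # weight given to the newest action
        self.state = None    # last filtered action, None until the first call

    def reset(self) -> None:
        """Forget the filter state, e.g. at the start of an episode."""
        self.state = None

    def __call__(self, action) -> np.ndarray:
        action = np.asarray(action, dtype=float)
        if self.state is None:
            # The first action passes through unchanged.
            self.state = action.copy()
        else:
            # Blend the new action with the previous filtered action.
            self.state = self.alpha * action + (1.0 - self.alpha) * self.state
        return self.state.copy()


# Usage: a step change in the raw action is smoothed over several calls.
lpf = ActionLowPassFilter(alpha=0.5)
smoothed = [lpf(a) for a in ([0.0, 0.0], [1.0, 1.0], [1.0, 1.0])]
```

A smaller `alpha` smooths more aggressively at the cost of added lag, which matters in a task as time-critical as catching.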

Abstract

Catching objects in flight (i.e., thrown objects) is a common daily skill for humans, yet it presents a significant challenge for robots. This task requires a robot with agile and accurate motion, a large spatial workspace, and the ability to interact with diverse objects. In this paper, we build a mobile manipulator composed of a mobile base, a 6-DoF arm, and a 12-DoF dexterous hand to tackle such a challenging task. We propose a two-stage reinforcement learning framework to efficiently train a whole-body-control catching policy for this high-DoF system in simulation. The objects' throwing configurations, shapes, and sizes are randomized during training to enhance policy adaptivity to various trajectories and object characteristics in flight. The results show that our trained policy catches diverse objects with randomly thrown trajectories at a high success rate of about 80% in simulation, a significant improvement over the baselines. The policy trained in simulation can be directly deployed in the real world with onboard sensing and computation, where it catches sandbags of various shapes thrown randomly by humans. Our project page is available at https://mobile-dex-catch.github.io/.
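
To make the idea of randomized throwing configurations concrete, the following sketch samples a release position and velocity and evaluates the resulting drag-free ballistic path. Everything specific here is an assumption for illustration: the function names `sample_throw` and `ballistic_position`, the sampling ranges, and the neglect of air drag are not taken from the paper.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # m/s^2, z axis points up


def sample_throw(rng, pos_low, pos_high, speed_range, pitch_range_deg, yaw_range_deg):
    """Sample a release position and velocity (all ranges are hypothetical)."""
    p0 = rng.uniform(pos_low, pos_high)                # release position [m]
    speed = rng.uniform(*speed_range)                  # release speed [m/s]
    pitch = np.deg2rad(rng.uniform(*pitch_range_deg))  # elevation angle
    yaw = np.deg2rad(rng.uniform(*yaw_range_deg))      # heading angle
    direction = np.array([
        np.cos(pitch) * np.cos(yaw),
        np.cos(pitch) * np.sin(yaw),
        np.sin(pitch),
    ])
    return p0, speed * direction


def ballistic_position(p0, v0, t):
    """Position at time(s) t under gravity only (air drag is ignored)."""
    t = np.asarray(t, dtype=float)[..., None]
    return p0 + v0 * t + 0.5 * GRAVITY * t**2


# Usage: draw one random throw and evaluate its path over 0.8 s.
rng = np.random.default_rng(0)
p0, v0 = sample_throw(
    rng,
    pos_low=[-0.2, -0.2, 1.0], pos_high=[0.2, 0.2, 1.5],
    speed_range=(3.0, 5.0), pitch_range_deg=(20.0, 50.0), yaw_range_deg=(-15.0, 15.0),
)
path = ballistic_position(p0, v0, np.linspace(0.0, 0.8, 9))  # shape (9, 3)
```

In a training loop, a fresh throw would typically be drawn at each episode reset so that the policy sees a wide spread of trajectories.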

Paper Structure (30 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Sim2Real Illustration of Catching Motions.
  • Figure 2: System Overview. Our system comprises a mobile base, a 6-DoF arm, and a 12-DoF hand, whose goal is to catch objects thrown randomly by humans. The state difference (Diff.) is calculated between two consecutive timesteps. We employ an RGB-D camera to track the object in real time, and a Mini-PC equipped with a GPU to handle onboard computation.
  • Figure 3: Two-Stage RL Framework. Note that we use the proprioception from two consecutive timesteps, $O^{t, t-1}$, as the policy input.
  • Figure 4: Multi-Process Controller. A ROS-based multi-process controller synchronizes proprioceptive states and object position data for policy inference in real-time control of the mobile manipulator.
  • Figure 5: Object Set Overview. (a): (i) Objects in training; (ii) Objects in evaluation; (iii) Objects in real-world deployment. (b) Random Throwing Trajectory Visualization.
  • ...and 1 more figures