
Learning Dexterous Bimanual Catch Skills through Adversarial-Cooperative Heterogeneous-Agent Reinforcement Learning

Taewoo Kim, Youngwoo Yoon, Jaehong Kim

TL;DR

The paper tackles dexterous bimanual catching by introducing a heterogeneous-agent reinforcement learning framework with an adversarial-cooperative reward between a virtual thrower and a two-handed catcher. The method combines centralized training with decentralized execution (CTDE) using a HAPPO objective and a compound reward $r_{\text{total}} = \alpha r_{\text{catch}} + (1-\alpha) r_{\text{throw}}$, enabling adaptive throwing difficulty and robust catching across 15 object types. Key contributions include the first integration of bimanual catching within HARL, a novel adversarial-cooperative learning dynamic, and extensive simulation validation showing approximately a twofold improvement over single-agent baselines. The approach yields an implicit curriculum that enhances learning efficiency and generalization, though real-world deployment and broader object diversity remain avenues for future work.
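
As a rough illustration of the compound reward, the sketch below blends a catcher reward and a thrower reward with a weight $\alpha$. It assumes the formula is a plain convex combination; the function name `compound_reward` and the sample reward values are hypothetical, and the default $\alpha = 0.7$ is borrowed from the fixed setting named in the Figure 5 caption rather than from any stated implementation.

```python
def compound_reward(r_catch: float, r_throw: float, alpha: float = 0.7) -> float:
    """Blend the two agents' rewards: alpha * r_catch + (1 - alpha) * r_throw.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * r_catch + (1.0 - alpha) * r_throw


# Illustrative reward values only (assumed, not taken from the paper).
easy_throw = compound_reward(r_catch=0.9, r_throw=0.2)  # slow throw, easy catch
hard_throw = compound_reward(r_catch=0.3, r_throw=0.8)  # fast throw, hard catch
print(f"easy throw: {easy_throw:.2f}, hard throw: {hard_throw:.2f}")
```

With $\alpha$ closer to 1 the aggregate is dominated by catching success, while smaller values give the thrower more incentive to raise the difficulty.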

Abstract

Robotic catching has traditionally focused on single-handed systems, which are limited in their ability to handle larger or more complex objects. In contrast, bimanual catching offers significant potential for improved dexterity and object handling but introduces new challenges in coordination and control. In this paper, we propose a novel framework for learning dexterous bimanual catching skills using Heterogeneous-Agent Reinforcement Learning (HARL). Our approach introduces an adversarial reward scheme, where a throw agent increases the difficulty of throws by adjusting their speed, while a catch agent learns to coordinate both hands to catch objects under these evolving conditions. We evaluate the framework in simulated environments using 15 different objects, demonstrating robustness and versatility in handling diverse objects. Our method achieved approximately a 2x increase in catching reward compared to single-agent baselines across 15 diverse objects.

Paper Structure

This paper contains 16 sections, 5 equations, 9 figures.

Figures (9)

  • Figure 1: Dexterous bimanual catch skill is learned through adversarial cooperation. The catch agent with two arms and hands and the virtual throw agent are modeled for heterogeneous-agent reinforcement learning.
  • Figure 2: Throw-catch agents in a heterogeneous-agent RL framework. The throw agent, $\pi_{\text{throw}}$, sets the initial velocity of a random object at the start of each episode, while the catch agent, $\pi_{\text{catch}}$, learns to catch using a reward based on the object's distance to both hands. The adversarial-cooperative reward structure, shown in the center, reflects a paradoxical balance: when the action-induced velocity is high (red), the catch reward decreases due to adversarial difficulty, but the thrower gains a high reward. Conversely, when the velocity is low (blue), the catch reward increases, but the thrower receives a lower reward, highlighting the cooperative aspect. The aggregate reward is used for training via the HAPPO algorithm.
  • Figure 3: 15 objects in different sizes and shapes used in the experiments.
  • Figure 4: Catch agent performing bimanual catching of 15 different objects thrown randomly.
  • Figure 5: Box plots of catcher rewards for single-agent and proposed heterogeneous agents. The box represents the interquartile range from the 25th to the 75th percentile, with whiskers extending to the minimum and maximum non-outlier values. The blue line indicates the median reward, while the numbers inside the boxes represent the mean rewards. The proposed HA (Fixed, $\alpha=0.7$) achieves better performance than the other methods.
  • ...and 4 more figures
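
The box-plot quantities named in the Figure 5 caption (interquartile range, whiskers at the extreme non-outlier values, median, mean) can be computed as in the sketch below. The caption does not say how outliers are defined, so the common 1.5 x IQR fence is an assumption here, as are the helper names `percentile` and `box_stats` and the sample rewards, which are made up for illustration.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def box_stats(rewards, whisker=1.5):
    """Summary statistics behind a box plot.

    Assumption: points beyond `whisker` * IQR from the box count as outliers.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    vals = sorted(rewards)
    q1, median, q3 = (percentile(vals, q) for q in (25, 50, 75))
    iqr = q3 - q1
    low_fence, high_fence = q1 - whisker * iqr, q3 + whisker * iqr
    inliers = [v for v in vals if low_fence <= v <= high_fence]
    return {
        "mean": sum(vals) / len(vals),
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inliers),   # smallest non-outlier value
        "whisker_high": max(inliers),  # largest non-outlier value
    }


# Made-up episode rewards with one extreme value, for illustration only.
rewards = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
stats = box_stats(rewards)
print(stats)
```

In this toy sample the single extreme value pulls the mean above the median and falls outside the upper whisker, which is why a plot that reports both the median line and the mean can show them at different heights.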