JuggleRL: Mastering Ball Juggling with a Quadrotor via Deep Reinforcement Learning

Shilong Ji, Yinuo Chen, Chuqi Wang, Jiayu Chen, Ruize Zhang, Feng Gao, Wenhao Tang, Shu'ang Yu, Sirui Xiang, Xinlei Chen, Chao Yu, Yu Wang

TL;DR

JuggleRL addresses aerial ball juggling with a racket-equipped quadrotor by combining system identification, large-scale PPO-based reinforcement learning, and domain randomization to narrow the sim-to-real gap. The policy is deployed zero-shot with a latency-aware perception stack and averages 311 hits over 10 consecutive real-world trials (maximum 462), against an average of 3.1 hits for a model-based baseline. It also generalizes to an unseen, lighter 5 g ball, averaging 145.9 hits. The work shows that model-free reinforcement learning can deliver robust, reactive control for contact-rich aerial manipulation under uncertainty, with potential extensions to onboard vision and multi-agent coordination.

Abstract

Aerial robots interacting with objects must perform precise, contact-rich maneuvers under uncertainty. In this paper, we study the problem of aerial ball juggling using a quadrotor equipped with a racket, a task that demands accurate timing, stable control, and continuous adaptation. We propose JuggleRL, the first reinforcement learning-based system for aerial juggling. It learns closed-loop policies in large-scale simulation using systematic calibration of quadrotor and ball dynamics to reduce the sim-to-real gap. The training incorporates reward shaping to encourage racket-centered hits and sustained juggling, as well as domain randomization over ball position and coefficient of restitution to enhance robustness and transferability. The learned policy outputs mid-level commands executed by a low-level controller and is deployed zero-shot on real hardware, where an enhanced perception module with a lightweight communication protocol reduces delays in high-frequency state estimation and ensures real-time control. Experiments show that JuggleRL achieves an average of $311$ hits over $10$ consecutive trials in the real world, with a maximum of $462$ hits observed, far exceeding a model-based baseline that reaches at most $14$ hits with an average of $3.1$. Moreover, the policy generalizes to unseen conditions, successfully juggling a lighter $5$ g ball with an average of $145.9$ hits. This work demonstrates that reinforcement learning can empower aerial robots with robust and stable control in dynamic interaction tasks.
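The domain randomization mentioned in the abstract (over ball position and coefficient of restitution) can be pictured with a minimal per-episode sampling sketch. The names `EpisodeParams` and `sample_episode_params` and all numeric ranges are hypothetical, chosen only for illustration; the paper's actual ranges and simulator interface are not given in this summary.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    """Per-episode physical parameters (hypothetical container)."""
    ball_offset_xy_m: tuple  # horizontal offset of the ball from the racket center
    release_height_m: float  # height the ball is dropped from
    restitution: float       # racket-ball coefficient of restitution


def sample_episode_params(rng: random.Random) -> EpisodeParams:
    """Draw one randomized episode configuration.

    All ranges below are assumptions for illustration, not the paper's values.
    """
    offset = (rng.uniform(-0.15, 0.15), rng.uniform(-0.15, 0.15))
    height = rng.uniform(1.2, 1.8)
    restitution = rng.uniform(0.60, 0.85)
    return EpisodeParams(offset, height, restitution)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for _ in range(3):
        print(sample_episode_params(rng))
```

Resampling these parameters at every episode reset exposes the policy to a spread of contact conditions, so it does not overfit to a single calibrated value.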

Paper Structure

This paper contains 21 sections, 7 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Overview of the JuggleRL system. The system combines SysID-based dynamics calibration with large-scale GPU-parallel training in Isaac Sim and domain randomization to learn robust juggling policies. The learned policy outputs mid-level CTBR (collective thrust and body rates) commands executed by a low-level PID controller, and is deployed zero-shot on real hardware with an enhanced perception module that uses a lightweight communication protocol to minimize latency in high-frequency state estimation.
  • Figure 2: Visualization of the restitution coefficient between the racket and the ball. The central "sweet spot" exhibits a higher restitution coefficient (average $0.82$), while the periphery is less elastic (average $0.64$), motivating our domain randomization strategy.
  • Figure 3: Latency comparison with the lightweight communication protocol (LCP). Using the original communication scheme, the velocity signal exhibits step-like latency even at a 200 Hz publish rate. In contrast, LCP enables smooth, real-time transmission of velocity.
  • Figure 4: Performance comparison between JuggleRL and MBPP in simulation. JuggleRL demonstrates robust performance across all tested ball release heights, while MBPP fails at lower release heights.
  • Figure 5: Real-world juggling trajectory of a trial with a $0.13$ m horizontal offset, with the ball released from $1.68$ m.
  • ...and 3 more figures
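
To see why the restitution values in the Figure 2 caption matter, consider a simplified model: a ball dropped onto a fixed, rigid racket with no air drag. This is an idealization and not the paper's simulation, where the racket is carried by a moving quadrotor. Under that assumption the rebound height is the drop height scaled by the square of the restitution coefficient. The function name `rebound_height` is hypothetical.

```python
def rebound_height(drop_height_m: float, restitution: float) -> float:
    """Peak height after one bounce on a fixed, rigid surface (no drag).

    Impact speed is v = sqrt(2 g h), rebound speed is e * v, and the
    resulting peak height is (e v)^2 / (2 g) = e^2 * h.
    """
    return restitution ** 2 * drop_height_m


if __name__ == "__main__":
    # Average restitution values quoted in the Figure 2 caption.
    for label, e in (("sweet spot", 0.82), ("periphery", 0.64)):
        fraction = rebound_height(1.0, e)
        print(f"{label}: e={e:.2f}, rebound fraction={fraction:.3f}")
```

In this idealized setting a centered hit returns the ball to about 67% of its drop height and an edge hit to about 41%, so an off-center contact loses noticeably more energy. This is consistent with the caption's motivation for randomizing restitution and with the reward shaping toward racket-centered hits described in the abstract.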