Swooper: Learning High-Speed Aerial Grasping With a Simple Gripper

Ziken Huang, Xinze Niu, Bowen Chai, Renbiao Jin, Danping Zou

TL;DR

This work proposes Swooper, a deep reinforcement learning (DRL) based approach that achieves both precise flight control and active gripper control using a single lightweight neural network policy that seamlessly integrates high-speed flight and grasping.

Abstract

High-speed aerial grasping presents significant challenges due to the high demands on precise, responsive flight control and coordinated gripper manipulation. In this work, we propose Swooper, a deep reinforcement learning (DRL) based approach that achieves both precise flight control and active gripper control using a single lightweight neural network policy. Training such a policy directly via DRL is nontrivial due to the complexity of coordinating flight and grasping. To address this, we adopt a two-stage learning strategy: we first pre-train a flight control policy, and then fine-tune it to acquire grasping skills. With carefully designed reward functions and a tailored training framework, the entire training process completes in under 60 minutes on a standard desktop with an NVIDIA RTX 3060 GPU. To validate the trained policy in the real world, we develop a lightweight quadrotor grasping platform equipped with a simple off-the-shelf gripper, and deploy the policy in a zero-shot manner on the onboard Raspberry Pi 4B computer, where each inference takes only about 1.0 ms. In 25 real-world trials, our policy achieves an 84% grasp success rate and grasping speeds of up to 1.5 m/s without any fine-tuning. This matches the robustness and agility of state-of-the-art classical systems with sophisticated grippers, highlighting the capability of DRL for learning a robust control policy that seamlessly integrates high-speed flight and grasping. A supplementary video with more results is available at https://zikenhuang.github.io/Swooper/.
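
The headline numbers in the abstract can be unpacked with a little arithmetic. The sketch below is illustrative only: the success and failure counts are inferred from the reported 84% over 25 trials and are not stated in the abstract, and the inference rate is an upper bound implied by "about 1.0 ms" per inference, not the control-loop rate the authors actually run. All variable names are hypothetical.

```python
# Figures quoted in the abstract.
TRIALS = 25
SUCCESS_RATE = 0.84
INFERENCE_TIME_S = 1.0e-3  # "about 1.0 ms" per policy inference

# Inferred (not reported) counts behind the 84% success rate.
successes = round(TRIALS * SUCCESS_RATE)
failures = TRIALS - successes

# Upper bound on how often the policy could be evaluated on the
# onboard computer if inference were the only cost (an assumption).
max_inference_rate_hz = 1.0 / INFERENCE_TIME_S

print(f"{successes} successes, {failures} failures out of {TRIALS} trials")
print(f"inference-only rate bound: about {max_inference_rate_hz:.0f} Hz")
```

Under these assumptions, the reported rate corresponds to 21 successful grasps and 4 failures, and the per-inference cost leaves room for evaluation rates up to roughly 1 kHz.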

Paper Structure

This paper contains 17 sections, 4 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: The simulation environment and the illustration of the aerial grasping process.
  • Figure 2: To grasp a target object placed at an arbitrary position and orientation, the quadrotor has to rotate its yaw angle to align with that of the object during the approaching phase.
  • Figure 3: System overview. Our two-stage DRL-based approach first trains a flight control policy and then fine-tunes it to acquire gripper control, finally yielding a unified and lightweight aerial grasping policy. Each stage is guided by a tailored reward function. The policy network takes the current and desired states of the quadrotor as input, and outputs a CTBR (collective thrust and body rates) command for flight control and a gripper control command. OTE refers to the Online Throttle Estimation module described in the real-world experiments section.
  • Figure 4: Learning curves of policies trained with different settings. Each solid line indicates the mean performance across 5 training runs with different random seeds, and its shaded band represents the standard deviation. Note that the criteria for success differ between the two stages and the success rate of TFS refers to the grasp success rate.
  • Figure 5: Grasp success rate vs. grasping speed and relative object yaw angle. The relative object yaw angle is the initial yaw error between the quadrotor and the object.
  • ...and 3 more figures
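
The policy interface described in the Figure 3 caption (state in, CTBR command plus gripper command out) can be sketched as a small two-layer network. This is a minimal illustration, not the authors' architecture: the observation size, hidden width, output scaling, and the thresholding of the gripper output are all assumptions, and the weights are random rather than trained.

```python
import math
import random

# Hypothetical sizes: the source gives no observation layout or layer widths.
OBS_DIM = 12     # assumed: features built from current and desired states
HIDDEN_DIM = 32  # assumed hidden width
ACT_DIM = 5      # 4 CTBR values (thrust + 3 body rates) + 1 gripper command


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply one fully connected layer to the input vector x."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def policy_forward(params, obs):
    """Map an observation to (ctbr, gripper_closed)."""
    hidden = [math.tanh(v) for v in linear(params["hidden"], obs)]
    raw = [math.tanh(v) for v in linear(params["out"], hidden)]
    thrust = 0.5 * (raw[0] + 1.0)   # normalized collective thrust in [0, 1]
    body_rates = raw[1:4]           # normalized roll, pitch, yaw rates in [-1, 1]
    gripper_closed = raw[4] > 0.0   # assumed: positive output closes the gripper
    return [thrust] + body_rates, gripper_closed


rng = random.Random(0)
params = {
    "hidden": init_layer(OBS_DIM, HIDDEN_DIM, rng),
    "out": init_layer(HIDDEN_DIM, ACT_DIM, rng),
}
obs = [0.1] * OBS_DIM  # placeholder observation
ctbr, gripper_closed = policy_forward(params, obs)
print(ctbr, gripper_closed)
```

A single forward pass yields both the flight command and the gripper command, which is the property the caption emphasizes. In the two-stage scheme described there, one plausible reading is that the flight outputs are learned first and the gripper output is acquired during fine-tuning, but the source does not specify how the network is modified between stages.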