Curriculum Reinforcement Learning for Quadrotor Racing with Random Obstacles

Fangyu Sun, Fanxing Li, Yu Hu, Linzuo Zhang, Yueqian Liu, Wenxian Yu, Danping Zou

TL;DR

A novel vision-based curriculum reinforcement learning framework for training a robust controller capable of addressing unseen obstacles in drone racing, which combines multi-stage curriculum learning, domain randomization, and a multi-scene updating strategy to address the conflicting challenges of obstacle avoidance and gate traversal.

Abstract

Autonomous drone racing has attracted increasing interest as a research topic for exploring the limits of agile flight. However, existing studies primarily focus on obstacle-free racetracks, while the perception and dynamic challenges introduced by obstacles remain underexplored, often resulting in low success rates and limited robustness in real-world flight. To this end, we propose a novel vision-based curriculum reinforcement learning framework for training a robust controller capable of addressing unseen obstacles in drone racing. We combine multi-stage curriculum learning, domain randomization, and a multi-scene updating strategy to address the conflicting challenges of obstacle avoidance and gate traversal. Our end-to-end control policy is implemented as a single network, allowing high-speed flight of quadrotors in environments with variable obstacles. Both hardware-in-the-loop and real-world experiments demonstrate that our method achieves faster lap times and higher success rates than existing approaches, effectively advancing drone racing in obstacle-rich environments. The video and code are available at: https://github.com/SJTU-ViSYS-team/CRL-Drone-Racing.

Paper Structure

This paper contains 24 sections, 5 equations, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: Our quadrotor autonomously races with random obstacles in the real world at speeds of up to 8 m/s. (a) The real-world experiment on the S-shaped racetrack, with the flown trajectory. (b) Our drone carries an onboard Raspberry Pi computer and an Intel D435i depth camera. (c) FPV depth images during racing with obstacles.
  • Figure 2: The framework of our RL policy training network for the obstacle-rich racing task. The network architecture consists solely of simple CNNs and MLPs without any complex backbones. "Cat" is the abbreviation for "Concatenate".
  • Figure 3: Balancing the tasks of gate passing and obstacle avoidance is challenging when training an RL policy. The two tasks conflict in reward design: the obstacle-avoidance reward can cause the drone to bypass the gates from the sides rather than pass through them directly.
  • Figure 4: Three different racetracks with trajectories flown by our policy in simulation. Under our vision-based policy, the drone races successfully through densely cluttered environments.
  • Figure 5: The framework of multi-scene updating. The multi-scene updating approach improves policy training efficiency by customizing the number of scenes in each agent group.
  • ...and 3 more figures
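
The conflict described in the Figure 3 caption can be illustrated with a toy calculation. The sketch below is not the paper's reward function: the 1-D gate model, the constants `GATE_HALF_WIDTH`, `CLEARANCE_CAP`, and `GATE_BONUS`, and the weighting parameter `avoid_weight` are all hypothetical and chosen only for illustration. It assumes the gate frame is treated like any other obstacle, so a clearance-based avoidance term grows as the drone moves away from the frame.

```python
# Toy 1-D illustration of the gate-passing vs. obstacle-avoidance conflict.
# All constants and function names here are hypothetical, not from the paper.

GATE_HALF_WIDTH = 0.75  # metres, assumed gate half-width
CLEARANCE_CAP = 2.0     # metres, clearance beyond this earns no extra reward
GATE_BONUS = 1.0        # assumed reward for crossing inside the gate


def clearance_to_frame(lateral_offset: float) -> float:
    """Distance from the crossing point to the nearest gate edge."""
    return abs(abs(lateral_offset) - GATE_HALF_WIDTH)


def step_reward(lateral_offset: float, avoid_weight: float) -> float:
    """Gate bonus plus a weighted, capped clearance (avoidance) term."""
    passed = abs(lateral_offset) < GATE_HALF_WIDTH
    gate_term = GATE_BONUS if passed else 0.0
    avoid_term = min(clearance_to_frame(lateral_offset), CLEARANCE_CAP)
    return gate_term + avoid_weight * avoid_term


def best_offset(avoid_weight: float) -> float:
    """Grid-search the lateral offset in [-3, 3] m that maximizes the toy reward."""
    candidates = [i / 100.0 for i in range(-300, 301)]
    return max(candidates, key=lambda x: step_reward(x, avoid_weight))


if __name__ == "__main__":
    for w in (0.2, 2.0):
        x = best_offset(w)
        side = "through the gate" if abs(x) < GATE_HALF_WIDTH else "around the gate"
        print(f"avoid_weight={w}: best crossing offset {x:+.2f} m ({side})")
```

In this toy model, a small avoidance weight makes the gate centre the best crossing point, while a large one makes flying well clear of the frame score higher than passing through it. This mirrors the bypassing behaviour the caption describes, under the stated assumptions.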