
An End-to-end Flight Control Network for High-speed UAV Obstacle Avoidance based on Event-Depth Fusion

Dikai Shang, Jingyue Zhao, Shi Xu, Nanyang Ye, Lei Wang

Abstract

Achieving safe, high-speed autonomous flight in complex environments with static, dynamic, or mixed obstacles remains challenging, as a single perception modality is incomplete. Depth cameras are effective for static objects but suffer from motion blur at high speeds. Conversely, event cameras excel at capturing rapid motion but struggle to perceive static scenes. To exploit the complementary strengths of both sensors, we propose an end-to-end flight control network that achieves feature-level fusion of depth images and event data through a bidirectional cross-attention module. The end-to-end network is trained via imitation learning, which relies on high-quality supervision. Building on this insight, we design an efficient expert planner using Spherical Principal Search (SPS). This planner reduces computational complexity from $O(n^2)$ to $O(n)$ while generating smoother trajectories, achieving over an 80% success rate at 17 m/s, nearly 20% higher than traditional planners. Simulation experiments show that our method attains a 70-80% success rate at 17 m/s across varied scenes, surpassing single-modality and unidirectional fusion models by 10-20%. These results demonstrate that bidirectional fusion effectively integrates event and depth information, enabling more reliable obstacle avoidance in complex environments with both static and dynamic objects.
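The bidirectional cross-attention fusion described in the abstract can be sketched as follows. This is a simplified, hypothetical NumPy illustration (single head, no learned Q/K/V projections, illustrative residual connections); the paper's actual module is defined in Figure 5 and the model sections, and its details may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Scaled dot-product attention: tokens from one modality (queries)
    attend to keys/values taken directly from the other modality."""
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)      # (Nq, Nc) attention scores
    return softmax(scores, axis=-1) @ context      # weighted sum of context tokens

def bidirectional_fusion(depth_feats, event_feats):
    """Each modality attends to the other, so depth features are enriched
    with motion cues from events and vice versa (hypothetical residual form)."""
    depth_enriched = depth_feats + cross_attention(depth_feats, event_feats)
    event_enriched = event_feats + cross_attention(event_feats, depth_feats)
    return depth_enriched, event_enriched

rng = np.random.default_rng(0)
depth = rng.standard_normal((16, 64))   # 16 depth-feature tokens, dim 64
events = rng.standard_normal((16, 64))  # 16 event-feature tokens, dim 64
d_out, e_out = bidirectional_fusion(depth, events)
print(d_out.shape, e_out.shape)  # (16, 64) (16, 64)
```

The key point the abstract makes is that the attention runs in both directions, unlike unidirectional fusion where only one modality queries the other.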

Paper Structure

This paper contains 11 sections, 7 equations, 10 figures.

Figures (10)

  • Figure 1: Overall model architecture. The expert planner generates optimal trajectories using global environmental information, while the student network learns end-to-end control from local sensor inputs (depth images and event data) and the UAV’s state.
  • Figure 2: Comparison of path search strategies. (a) Traditional planar grid search; (b) Proposed SPS.
  • Figure 3: Overall student network architecture
  • Figure 4: Schematic diagrams of scenarios with various obstacles: (a) scenarios with trees; (b) scenarios with static spherical obstacles; (c) scenarios with mixed (static and dynamic) spherical obstacles.
  • Figure 5: Schematic diagram of the bidirectional attention module
  • ...and 5 more figures