Vision-Guided MPPI for Agile Drone Racing: Navigating Arbitrary Gate Poses via Neural Signed Distance Fields

Fangguo Zhao, Hanbing Zhang, Zhouheng Li, Xin Guan, Shuo Li

TL;DR

This work proposes a fully onboard, vision-guided optimal control framework that enables reference-free agile flight through arbitrarily placed and oriented gates. The formulation inherently maintains spatial consistency, ensuring robust navigation even under severe visual occlusion during aggressive maneuvers.

Abstract

Autonomous drone racing requires the tight coupling of perception, planning, and control under extreme agility. However, recent approaches typically rely on precomputed spatial reference trajectories or explicit 6-DoF gate pose estimation, rendering them brittle to spatial perturbations, unmodeled track changes, and sensor noise. Conversely, end-to-end learning policies frequently overfit to specific track layouts and struggle with zero-shot generalization. To address these fundamental limitations, we propose a fully onboard, vision-guided optimal control framework that enables reference-free agile flight through arbitrarily placed and oriented gates. Central to our approach is Gate-SDF, a novel, implicitly learned neural signed distance field. Gate-SDF directly processes raw, noisy depth images to predict a continuous spatial field that provides both collision repulsion and active geometric guidance toward the valid traversal area. We integrate this representation into a sampling-based Model Predictive Path Integral (MPPI) controller. By fully exploiting GPU parallelism, the framework evaluates these continuous spatial constraints across thousands of simulated trajectory rollouts simultaneously in real time. Furthermore, our formulation inherently maintains spatial consistency, ensuring robust navigation even under severe visual occlusion during aggressive maneuvers. Extensive simulations and real-world experiments demonstrate that the proposed system achieves high-speed agile flight and successfully navigates unseen tracks subject to severe unmodeled gate displacements and orientation perturbations. Videos are available at https://zhaofangguo.github.io/vision_guided_mppi/
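
For orientation, the sampling-based update that MPPI controllers of this kind build on can be written in its standard textbook form. This is not necessarily the exact formulation used in the paper: the trajectory cost $S_m$, the temperature $\lambda$, and the control perturbations $\boldsymbol{\epsilon}_{m,k}$ are notation assumed here, with $M$ rollouts and a horizon of $K$ steps.

$$
w_m = \frac{\exp\!\big(-(S_m - S_{\min})/\lambda\big)}{\sum_{j=1}^{M} \exp\!\big(-(S_j - S_{\min})/\lambda\big)},
\qquad
\mathbf{u}_k \leftarrow \mathbf{u}_k + \sum_{m=1}^{M} w_m \, \boldsymbol{\epsilon}_{m,k},
\quad k = 1, \dots, K
$$

Here $S_{\min} = \min_j S_j$ is subtracted only for numerical stability and does not change the weights. Low-cost rollouts receive exponentially larger weights, so the update pulls the nominal control sequence toward them.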

Paper Structure

This paper contains 18 sections, 10 equations, 10 figures, and 2 tables.

Figures (10)

  • Figure 1: Real-world experiments with gates arranged at varying positions and orientations. All computations are performed online by the onboard computer using depth images.
  • Figure 2: Overview of the proposed framework. At each control step, a depth encoder extracts a latent vector $\mathbf{z}$, which is duplicated $M \times K$ times ($M$ rollouts, each with a horizon of $K$ steps) and concatenated with each MPPI-sampled state $\mathbf{p}$. The SDF decoder evaluates these points to formulate vision-guided safety constraints. Combined with a gate progress objective, the optimal control sequence is derived via cost-weighted averaging.
  • Figure 3: Dataset Generation Pipeline.
  • Figure 4: Overview of the two-stage training pipeline. In simulation, a denoising autoencoder is trained to robustly extract gate features from noisy images, alongside an SDF decoder. In the real-world domain, the image encoder is fine-tuned to adapt to specific sensor characteristics.
  • Figure 5: Predicted Gate-SDF visualization. The leftmost panel is the onboard depth image. Four horizontal slices (colored lines) are sampled, and their top-down SDF maps show that the model correctly captures the gate's traversable region.
  • ...and 5 more figures
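
The data flow described in the Figure 2 caption (tile the latent vector $M \times K$ times, pair it with every sampled state, decode to SDF values, then average controls by cost) can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_sdf_decoder`, `rollout_positions`, the hinge-style safety cost, the goal-distance progress term, and all parameter values are hypothetical stand-ins, and the single-integrator dynamics replace the real quadrotor model.

```python
import numpy as np


def toy_sdf_decoder(z, p):
    """Hypothetical stand-in for a learned SDF decoder.

    z: (N, D) latent vectors, p: (N, 3) query points.
    Toy rule: the first three latent entries are an obstacle center and the
    output is the signed distance to a sphere of radius 0.5 around it.
    """
    return np.linalg.norm(p - z[:, :3], axis=1) - 0.5


def rollout_positions(p0, controls, dt=0.1):
    """Toy single-integrator rollout: controls are velocities, shape (M, K, 3)."""
    return p0 + np.cumsum(controls, axis=1) * dt


def mppi_step(u_nominal, z, p0, goal, rng, M=64, sigma=0.5, lam=1.0, margin=0.2):
    K, nu = u_nominal.shape
    eps = rng.normal(0.0, sigma, size=(M, K, nu))   # control perturbations
    points = rollout_positions(p0, u_nominal[None] + eps)  # (M, K, 3)

    # Duplicate the single latent vector M*K times and pair it with every
    # sampled point so the decoder runs as one batched call.
    z_tiled = np.tile(z, (M * K, 1))                # (M*K, D)
    sdf = toy_sdf_decoder(z_tiled, points.reshape(M * K, 3)).reshape(M, K)

    # Assumed cost: hinge penalty near the surface plus distance-to-goal
    # at the end of the horizon as a crude progress term.
    safety = np.maximum(0.0, margin - sdf).sum(axis=1)
    progress = np.linalg.norm(points[:, -1] - goal, axis=1)
    cost = 10.0 * safety + progress                 # (M,)

    # Cost-weighted averaging of the perturbations (softmin weights).
    w = np.exp(-(cost - cost.min()) / lam)
    w /= w.sum()
    return u_nominal + np.einsum("m,mkn->kn", w, eps), w


rng = np.random.default_rng(0)
z = np.array([1.0, 0.0, 0.0, 0.3])   # toy latent, D = 4
u_new, w = mppi_step(
    u_nominal=np.zeros((10, 3)),
    z=z,
    p0=np.zeros(3),
    goal=np.array([2.0, 0.0, 0.0]),
    rng=rng,
)
print(u_new.shape, w.shape)
```

Flattening the $M \times K$ points into a single batch is what makes the decoder evaluation map well onto a GPU: every query is independent once it carries its own copy of $\mathbf{z}$.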