Vision-Guided MPPI for Agile Drone Racing: Navigating Arbitrary Gate Poses via Neural Signed Distance Fields

Fangguo Zhao; Hanbing Zhang; Zhouheng Li; Xin Guan; Shuo Li

TL;DR

This work proposes a fully onboard, vision-guided optimal control framework for reference-free agile flight through arbitrarily placed and oriented gates. The formulation inherently maintains spatial consistency, which keeps navigation robust even under severe visual occlusion during aggressive maneuvers.

Abstract

Autonomous drone racing requires the tight coupling of perception, planning, and control under extreme agility. However, recent approaches typically rely on precomputed spatial reference trajectories or explicit 6-DoF gate pose estimation, rendering them brittle to spatial perturbations, unmodeled track changes, and sensor noise. Conversely, end-to-end learning policies frequently overfit to specific track layouts and struggle with zero-shot generalization. To address these limitations, we propose a fully onboard, vision-guided optimal control framework that enables reference-free agile flight through arbitrarily placed and oriented gates. Central to our approach is Gate-SDF, a novel, implicitly learned neural signed distance field. Gate-SDF directly processes raw, noisy depth images to predict a continuous spatial field that provides both collision repulsion and active geometric guidance toward the valid traversal area. We integrate this representation into a sampling-based Model Predictive Path Integral (MPPI) controller. By exploiting GPU parallelism, the framework evaluates these continuous spatial constraints across thousands of simulated trajectory rollouts simultaneously in real time. Furthermore, our formulation inherently maintains spatial consistency, ensuring robust navigation even under severe visual occlusion during aggressive maneuvers. Extensive simulations and real-world experiments demonstrate that the proposed system achieves high-speed agile flight and successfully navigates unseen tracks subject to severe unmodeled gate displacements and orientation perturbations. Videos are available at https://zhaofangguo.github.io/vision_guided_mppi/
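To make the sampling-based update concrete, the following is a minimal sketch of one generic MPPI step on a toy 1-D problem, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the single-integrator dynamics, the temperature and noise values, and in particular `toy_gate_clearance`, an analytic stand-in for a learned field in which positive values mean the point lies inside the traversable opening. The paper's controller evaluates rollouts in parallel on a GPU, whereas this sketch loops in plain Python.

```python
import math
import random

def toy_gate_clearance(y, half_width=0.5):
    # Hypothetical analytic stand-in for a learned field: clearance (m) of the
    # lateral offset y from the gate frame, positive inside the opening.
    return half_width - abs(y)

def rollout_cost(y0, controls, dt=0.1):
    # Integrate a 1-D single integrator and penalise only the states that lie
    # outside the opening (negative clearance).
    y, cost = y0, 0.0
    for u in controls:
        y += u * dt
        cost += max(0.0, -toy_gate_clearance(y))
    return cost

def mppi_weights(costs, temperature=1.0):
    # Softmin weights; subtracting the minimum cost keeps exp() well scaled.
    c_min = min(costs)
    unnorm = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(unnorm)
    return [w / total for w in unnorm]

def mppi_update(nominal, noises, costs, temperature=1.0):
    # New control at step k = nominal control + cost-weighted average of noise.
    weights = mppi_weights(costs, temperature)
    return [
        u_k + sum(w * eps[k] for w, eps in zip(weights, noises))
        for k, u_k in enumerate(nominal)
    ]

def mppi_step(y0, nominal, num_rollouts=256, sigma=1.0, temperature=0.1, seed=0):
    # Sample perturbed control sequences, score each rollout, then average.
    rng = random.Random(seed)
    noises = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(num_rollouts)]
    costs = [
        rollout_cost(y0, [u + e for u, e in zip(nominal, eps)])
        for eps in noises
    ]
    return mppi_update(nominal, noises, costs, temperature)
```

Calling `mppi_step(y0=2.0, nominal=[0.0] * 10)` returns an updated 10-step control sequence that steers the toy state toward the opening. A lower temperature concentrates the average on the cheapest rollouts, while a higher one blends more of them.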

Paper Structure (18 sections, 10 equations, 10 figures, 2 tables)

Introduction
Related Work
Gate-Shaped Signed Distance Field
Gate-specific signed distance field (Gate-SDF)
Learning the Safety Region without External Gate Pose
Vision Guided Racing Controller via MPPI
Model Predictive Path Integral Control Framework
Cost for Drone Racing
Gate Progress Cost
Perception Alignment Cost
Gate-SDF Guided Safety Cost
Experiments
Simulation Experiments
Implementation Details
Gate-SDF Robustness and Spatial Consistency
...and 3 more sections

Figures (10)

Figure 1: Real-world experiments with gates arranged at varying positions and orientations. All computations are performed online by the onboard computer using depth images.
Figure 2: Overview of the proposed framework. At each control step, a depth encoder extracts a latent vector $\mathbf{z}$, which is duplicated $M \times K$ times ($M$ rollouts, $K$ horizon steps) and concatenated with each MPPI-sampled state $\mathbf{p}$. The SDF decoder evaluates these points to formulate vision-guided safety constraints. Combined with a gate progress objective, the optimal control sequence is derived via cost-weighted averaging.
Figure 3: Dataset Generation Pipeline.
Figure 4: Overview of the two-stage training pipeline. In simulation, a denoising autoencoder is trained to robustly extract gate features from noisy images, alongside an SDF decoder. In the real-world domain, the image encoder is fine-tuned to adapt to specific sensor characteristics.
Figure 5: Predicted Gate-SDF visualization. The leftmost panel is the onboard depth image. Four horizontal slices (colored lines) are sampled, and their top-down SDF maps show that the model correctly captures the gate's traversable region.
...and 5 more figures
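The batching described in the Figure 2 caption can be sketched as follows. This is an illustrative assumption rather than the paper's code: one latent vector is paired with every sampled point, the flat decoder output is folded back into one cost per rollout, and the learned decoder is replaced by a hypothetical `placeholder_decoder`. The sketch also assumes that $\mathbf{p}$ is a 3-D position, that positive decoder values mean free space, and that the cost is a hinge penalty with a hypothetical `margin` parameter.

```python
import math

def build_decoder_inputs(latent, rollouts):
    # rollouts: M trajectories, each a list of K (x, y, z) positions.
    # Returns M*K rows, each the shared latent followed by one position.
    return [list(latent) + list(p) for traj in rollouts for p in traj]

def placeholder_decoder(row, latent_dim):
    # Hypothetical stand-in for a learned decoder: ignores the latent and
    # returns the clearance inside a tube of radius 1 m around the x-axis.
    _, py, pz = row[latent_dim:]
    return 1.0 - math.hypot(py, pz)

def safety_cost_per_rollout(values, num_rollouts, horizon, margin=0.2):
    # Fold the flat decoder output back to M x K and sum a hinge penalty
    # over each rollout's horizon.
    costs = []
    for m in range(num_rollouts):
        row = values[m * horizon:(m + 1) * horizon]
        costs.append(sum(max(0.0, margin - d) for d in row))
    return costs

if __name__ == "__main__":
    latent = [0.1, 0.2]  # toy 2-D latent vector
    rollouts = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # stays on the tube axis
        [(0.0, 2.0, 0.0), (1.0, 2.0, 0.0)],  # stays outside the tube
    ]
    inputs = build_decoder_inputs(latent, rollouts)
    values = [placeholder_decoder(r, len(latent)) for r in inputs]
    print(safety_cost_per_rollout(values, num_rollouts=2, horizon=2))
```

Flattening to $M \times K$ rows lets a single batched decoder call score every sampled point. On a GPU, the list comprehension would be replaced by a tensor tile and concatenate.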
