Vector Field Augmented Differentiable Policy Learning for Vision-Based Drone Racing

Yang Su, Feng Yu, Yu Hu, Xinze Niu, Linzuo Zhang, Fangyu Sun, Danping Zou

TL;DR

DiffRacing is a vector field-augmented differentiable policy learning framework for vision-based drone racing. It achieves superior sample efficiency, faster convergence, and robust flight performance, showing that vector fields can augment traditional gradient-based policy learning with a task-specific geometric prior.

Abstract

Autonomous drone racing in complex environments requires agile, high-speed flight while maintaining reliable obstacle avoidance. Differentiable-physics-based policy learning has recently demonstrated high sample efficiency and remarkable performance across various tasks, including agile drone flight and quadruped locomotion. However, applying such methods to drone racing remains difficult, as key objectives such as gate traversal are inherently hard to express as smooth, differentiable losses. To address these challenges, we propose DiffRacing, a novel vector field-augmented differentiable policy learning framework. DiffRacing integrates differentiable losses and vector fields into the training process to provide continuous and stable gradient signals, balancing obstacle avoidance and high-speed gate traversal. In addition, a differentiable Delta Action Model compensates for dynamics mismatch, enabling efficient sim-to-real transfer without explicit system identification. Extensive simulation and real-world experiments show that DiffRacing achieves superior sample efficiency, faster convergence, and robust flight performance, indicating that vector fields can augment traditional gradient-based policy learning with a task-specific geometric prior.
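
The idea of adding a vector field to a loss gradient can be illustrated with a toy example. The sketch below is not the DiffRacing implementation: the abstract does not specify the form of the field or how it is combined with the loss gradients, so the constant-magnitude field, the function names (`attractive_field`, `augmented_gradient`, `gradient_step`), and the parameters `gain`, `weight`, and `step_size` are all assumptions made for illustration. It shows only that subtracting a gate-pointing field from a gradient makes a descent step move toward the gate even when the loss gradient itself is zero.

```python
import math


def attractive_field(position, gate_center, gain=1.0):
    """Hypothetical field: a vector of magnitude `gain` pointing at the gate center."""
    offset = [g - p for p, g in zip(position, gate_center)]
    distance = math.sqrt(sum(c * c for c in offset))
    if distance < 1e-9:
        # At the gate center the direction is undefined; return a zero vector.
        return [0.0] * len(offset)
    return [gain * c / distance for c in offset]


def augmented_gradient(loss_grad, position, gate_center, weight=0.5):
    """Assumed combination rule: subtract the weighted field from the loss gradient.

    Gradient descent moves against the gradient, so subtracting the field
    here shifts the resulting update toward the gate.
    """
    field = attractive_field(position, gate_center)
    return [g - weight * f for g, f in zip(loss_grad, field)]


def gradient_step(position, grad, step_size=0.1):
    """Plain gradient-descent update on the position."""
    return [p - step_size * g for p, g in zip(position, grad)]


# Toy scenario: the loss gradient is zero, so the loss alone gives no signal.
position = [0.0, 1.0, 0.0]
gate_center = [2.0, 1.0, 0.0]
loss_grad = [0.0, 0.0, 0.0]

grad = augmented_gradient(loss_grad, position, gate_center)
new_position = gradient_step(position, grad)
```

In this toy setting the update moves the position along the x-axis toward the gate, which is the kind of continuous guidance a geometric prior can supply where a loss is flat or non-smooth.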

Paper Structure (17 sections, 18 equations, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Compared to typical differentiable-dynamics-based methods that rely solely on dense differentiable loss functions to provide gradients, our framework integrates Attractive Vector Fields as a geometric prior for gate traversal, alongside standard loss functions to ensure safety.
  • Figure 2: Overview of the DiffRacing Framework. The policy network ($\pi$) takes depth and state observations as input. The core components are the differentiable simulator, which allows loss gradients to back-propagate to the network; the Attractive Vector Field module, which augments policy gradients during training (the detailed augmentation mechanism is provided in Sec. \ref{sec:vector_field_aug_policy_learning}); and the Delta Action Model, which is trained to compensate for dynamics mismatch. Dashed lines indicate gradient flow. The orange lines represent the data flow during policy training, where the entire pipeline (UAV dynamics rollout, image rendering, loss computation, and gradient backpropagation) is carried out fully within the simulator. The blue lines represent the data flow described in Sec. \ref{sec:delta_action}, which includes data collection, Delta Action Model training, and subsequent policy fine-tuning.
  • Figure 3: 3D Magnetic field visualization: (a) Top view of magnetic field; (b) Axonometric view of magnetic field.
  • Figure 4: An Intuitive Illustration of Trajectory Self-Correction (Top View): From left to right, an overshooting trajectory is gradually refined under the continuous guidance of the Attractive Vector Fields.
  • Figure 5: Performance comparison between different training schemes.
  • ...and 2 more figures