Performance-Guided Refinement for Visual Aerial Navigation using Editable Gaussian Splatting in FalconGym 2.0

Yan Miao, Ege Yuceel, Georgios Fainekos, Bardh Hoxha, Hideki Okamoto, Sayan Mitra

TL;DR

This work proposes a Performance-Guided Refinement (PGR) algorithm, which concentrates a visual policy's training on challenging tracks while iteratively improving its performance, and shows that a single visual policy trained with PGR in FalconGym 2.0 outperforms state-of-the-art baselines in generalization and robustness.

Abstract

Visual policy design is crucial for aerial navigation. However, state-of-the-art visual policies often overfit to a single track, and their performance degrades when the track geometry changes. We develop FalconGym 2.0, a photorealistic simulation framework built on Gaussian Splatting (GSplat) with an Edit API that programmatically generates diverse static and dynamic tracks in milliseconds. Leveraging FalconGym 2.0's editability, we propose a Performance-Guided Refinement (PGR) algorithm, which concentrates a visual policy's training on challenging tracks while iteratively improving its performance. Across two case studies (fixed-wing UAVs and quadrotors) with distinct dynamics and environments, we show that a single visual policy trained with PGR in FalconGym 2.0 outperforms state-of-the-art baselines in generalization and robustness: it generalizes to three unseen tracks with 100% success without per-track retraining and maintains higher success rates under gate-pose perturbations. Finally, we demonstrate that the visual policy trained with PGR in FalconGym 2.0 transfers zero-shot from simulation to quadrotor hardware, achieving a 98.6% success rate (69 / 70 gates) over 30 trials spanning two three-gate tracks and a moving-gate track.

Paper Structure

This paper contains 19 sections, 1 equation, 7 figures, 2 tables, 1 algorithm.

Figures (7)

  • Figure 1: Trajectories for UAV case study in FalconGym 2.0 across three unseen tracks (Spatial-S, Random and Moving). Top row: red overlays visualize predicted gate masks (Section \ref{sec:vision-controller-architecture}). Bottom row: 10 trials per track from different initial states; the translucent gates in the Moving track show the gate's past positions.
  • Figure 2: Closed-loop system in FalconGym 2.0: we provide dynamics for a fixed-wing UAV and a quadrotor, and a GSplat renderer that produces photorealistic RGB from arbitrary camera poses in either scene. At each timestep, the dynamics propagate the state, the renderer generates an RGB image, a perception module predicts a gate mask, and a controller consumes the mask plus past actions to predict the next action. During training, a Performance-Guided Refinement (PGR) algorithm (Section \ref{sec:min-max-optimization}) focuses training on challenging tracks generated using the Edit API (Section \ref{sec:editable-gsplat}).
  • Figure 3: Edit API in FalconGym 2.0. Our Edit API (Section \ref{sec:editable-gsplat}) provides world-frame programmatic placement of objects while the backend handles all coordinates and camera-to-world transforms. The seven API operations (add, translate, rotate, scale, duplicate, delete, and lighting) allow users to modify object pose, size, and appearance to generate a photorealistic 4D simulation environment. Shown are gate edits across two environments. This editing capability enables the PGR algorithm to improve the visual policy (Section \ref{sec:min-max-optimization}).
  • Figure 4: Trajectories for quadrotor case study in FalconGym 2.0 across three unseen tracks (Left-Turn, Random and Moving).
  • Figure 5: Robustness to gate-pose perturbations on the Spatial-S track in FalconGym 2.0. For a perturbation level of $a$ cm, each gate is independently shifted by a random 3D offset $\delta \in [-a,a]^3$. For each perturbation level, we run all five policies on 10 randomized tracks (50 gates total) and report the Success Rate (SR).
  • ...and 2 more figures
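
The gate-pose perturbation described in the Figure 5 caption can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the offset $\delta$ is drawn uniformly from $[-a,a]^3$ (the caption says only "random"), that gate positions are plain `(x, y, z)` tuples in centimeters, and that the success rate is gates passed divided by gates attempted, as the abstract's "69 / 70 gates" suggests. The names `perturb_gates`, `success_rate`, and the nominal gate coordinates are hypothetical. The five gates per track follow from the caption's 50 gates over 10 tracks.

```python
import random


def perturb_gates(gates, a_cm, rng):
    """Shift each gate independently by an offset drawn from [-a_cm, a_cm]^3.

    Assumption: the offset is uniform per axis; the caption only says "random".
    `gates` is a list of (x, y, z) positions in centimeters (hypothetical layout).
    """
    perturbed = []
    for x, y, z in gates:
        dx, dy, dz = (rng.uniform(-a_cm, a_cm) for _ in range(3))
        perturbed.append((x + dx, y + dy, z + dz))
    return perturbed


def success_rate(gates_passed, gates_total):
    """Fraction of gates passed, assumed to be the Success Rate (SR) metric."""
    return gates_passed / gates_total


# Hypothetical nominal track with five gates (10 tracks x 5 gates = 50 gates).
nominal_track = [
    (0.0, 0.0, 100.0),
    (200.0, 50.0, 100.0),
    (400.0, 0.0, 120.0),
    (600.0, -50.0, 100.0),
    (800.0, 0.0, 100.0),
]

rng = random.Random(0)  # fixed seed so the sampled tracks are reproducible
a_cm = 20.0             # one illustrative perturbation level
randomized_tracks = [perturb_gates(nominal_track, a_cm, rng) for _ in range(10)]
total_gates = sum(len(track) for track in randomized_tracks)
```

Each randomized track would then be flown by every policy, with the passed-gate count fed to `success_rate`; sweeping `a_cm` over several levels would produce one robustness curve per policy.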