SWIFT-Nav: Stability-Aware Waypoint-Level TD3 with Fuzzy Arbitration for UAV Navigation in Cluttered Environments

Shuaidong Ji, Mahdi Bamdad, Francisco Cruz

TL;DR

SWIFT-Nav targets reliable UAV navigation in cluttered environments by combining a waypoint-level TD3 policy with stability-aware arbitration and a fuzzy safety layer. The system decouples planning from low-level control, uses Prioritised Experience Replay (PER) and a decaying epsilon-greedy strategy, and employs a lightweight trajectory checker to regularize proposals, enabling fast, stable learning. Its three-mode arbitration (Travel, RL, Landing) uses hysteresis, debounce, and dwell-time limits to bound mode switching, while a fuzzy safety score biases decisions to reduce oscillations. Demonstrated in Webots on Apple Silicon, SWIFT-Nav achieves smoother, shorter trajectories and robust generalization to unseen layouts, offering a deployable approach for real-time UAV navigation in cluttered scenes.
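To make the arbitration idea concrete, the following is a minimal sketch of a three-mode arbiter with hysteresis, debounce, and a minimum dwell time. It is not the authors' implementation: the class name `ModeArbiter`, every threshold value, and the assumption that RL mode can hand control back to Travel mode once the safety score recovers are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModeArbiter:
    # All values below are illustrative assumptions, not taken from the paper.
    enter_rl_below: float = 0.4   # Travel -> RL when safety drops below this
    exit_rl_above: float = 0.6    # RL -> Travel only once safety exceeds this
    debounce_steps: int = 3       # condition must hold for this many consecutive steps
    min_dwell_steps: int = 10     # minimum steps spent in a mode before leaving it
    mode: str = "Travel"
    _streak: int = 0              # consecutive steps the switch condition has held
    _dwell: int = 0               # steps spent in the current mode

    def step(self, safety: float, goal_reached: bool) -> str:
        # Landing is terminal: once entered, it is never left.
        if self.mode == "Landing":
            return self.mode
        # Either Travel or RL may transition to Landing when the goal is met.
        if goal_reached:
            self.mode = "Landing"
            return self.mode

        self._dwell += 1
        # Hysteresis: the two thresholds differ, so a safety score inside
        # the band [enter_rl_below, exit_rl_above] never triggers a switch.
        if self.mode == "Travel":
            wants_switch = safety < self.enter_rl_below
        else:
            wants_switch = safety > self.exit_rl_above

        # Debounce: count consecutive steps on which the condition holds.
        self._streak = self._streak + 1 if wants_switch else 0

        # Switch only if both the debounce and dwell-time requirements are met.
        if self._streak >= self.debounce_steps and self._dwell >= self.min_dwell_steps:
            self.mode = "RL" if self.mode == "Travel" else "Travel"
            self._streak = 0
            self._dwell = 0
        return self.mode
```

With these rules, a single noisy reading cannot flip the mode, and two switches are always separated by at least `min_dwell_steps` steps, which bounds the switching rate.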

Abstract

Efficient and reliable UAV navigation in cluttered and dynamic environments remains challenging. We propose SWIFT-Nav: Stability-aware Waypoint-level Integration of Fuzzy arbitration and TD3 for Navigation, a TD3-based navigation framework that achieves fast, stable convergence to obstacle-aware paths. The system couples a sensor-driven perception front end with a TD3 waypoint policy: the perception module converts LiDAR ranges into a confidence-weighted safety map and goal cues, while the TD3 policy is trained with Prioritised Experience Replay to focus on high-error transitions and a decaying epsilon-greedy exploration schedule that gradually shifts from exploration to exploitation. A lightweight fuzzy-logic layer computes a safety score from radial measurements and near obstacles, gates mode switching and clamps unsafe actions; in parallel, task-aligned reward shaping combining goal progress, clearance, and switch-economy terms provides dense, well-scaled feedback that accelerates learning. Implemented in Webots with proximity-based collision checking, our approach consistently outperforms baselines in trajectory smoothness and generalization to unseen layouts, while preserving real-time responsiveness. These results show that combining TD3 with replay prioritisation, calibrated exploration, and fuzzy-safety rules yields a robust and deployable solution for UAV navigation in cluttered scenes.
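The two training ingredients named in the abstract can be illustrated with a short sketch. The abstract does not give the decay law or the prioritisation variant, so the code below assumes an exponential epsilon decay with a floor and the common proportional form of prioritised replay, where a transition with TD error $\delta_i$ is sampled with probability proportional to $(|\delta_i| + c)^\alpha$. The function names and all default constants are hypothetical.

```python
def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay=0.995):
    """Exploration rate after `step` updates (assumed exponential decay with a floor)."""
    return max(eps_end, eps_start * decay ** step)


def per_probabilities(td_errors, alpha=0.6, small=1e-6):
    """Sampling probabilities for proportional prioritised replay.

    `alpha` controls how strongly sampling favours high-error transitions
    (alpha = 0 gives uniform sampling); `small` keeps zero-error transitions
    from never being sampled. Both defaults are assumptions.
    """
    priorities = [(abs(d) + small) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


if __name__ == "__main__":
    # Early training explores heavily; late training mostly exploits.
    print(epsilon_at(0), epsilon_at(200), epsilon_at(5000))
    # The transition with the largest absolute TD error is sampled most often.
    print(per_probabilities([0.1, -2.0, 0.5]))
```

In a full PER setup, importance-sampling weights are usually applied to the critic loss to correct the bias introduced by non-uniform sampling; that step is omitted here for brevity.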

Paper Structure

This paper contains 44 sections, 24 equations, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: SWIFT-Nav overview with abstracted steps and explicit mode transitions. Flow starts at Travel mode (arrow), where the UAV follows a guidance line. When the risk trigger fires, control switches to RL mode to propose/score waypoints; either mode transitions to Landing mode once the goal tolerance is met, after which the episode ends.
  • Figure 2: Directional weights for the fuzzy safety score. Front sectors ($0^\circ$–$30^\circ$ and $330^\circ$–$360^\circ$): $w{=}1.0$; side sectors ($30^\circ$–$150^\circ$, $210^\circ$–$330^\circ$): $w{=}0.5$; rear ($150^\circ$–$210^\circ$): $w{=}0.2$. Higher weight = higher directional importance.
  • Figure 3: DJI Mavic 2 Pro in Webots.
  • Figure 4: Simulation maps used for evaluation (Trajectory 1 vs. Trajectory 2).
  • Figure 5: Comparing trajectories on Trajectory 1: SWIFT-Nav produces a shorter, smoother route with fewer detours, whereas the baseline path deviates more around clutter.
  • ...and 4 more figures
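
The directional weights listed in the Figure 2 caption can be expressed as a small lookup. The sector boundaries and weights below come from that caption; how boundary angles are assigned, the aggregation in `safety_score`, and the `r_safe` clearance radius are assumptions made for illustration, since the caption does not specify how the weights enter the fuzzy safety score.

```python
def direction_weight(bearing_deg):
    """Directional weight for a LiDAR ray, following the sectors in Figure 2.

    Bearings are in degrees with 0 = straight ahead. Assigning each boundary
    angle to one sector (half-open intervals) is an assumption.
    """
    a = bearing_deg % 360.0
    if a < 30.0 or a >= 330.0:
        return 1.0   # front sectors
    if 150.0 <= a < 210.0:
        return 0.2   # rear sector
    return 0.5       # side sectors


def safety_score(readings, r_safe=2.0):
    """Hypothetical aggregation: 1 minus the worst direction-weighted proximity.

    `readings` is an iterable of (bearing_deg, range_m) pairs. A ray closer
    than `r_safe` contributes a proximity in (0, 1], scaled by its weight.
    """
    risk = 0.0
    for bearing, rng in readings:
        proximity = max(0.0, 1.0 - rng / r_safe)
        risk = max(risk, direction_weight(bearing) * proximity)
    return 1.0 - risk


if __name__ == "__main__":
    # The same 1 m obstacle lowers the score more when it is ahead than behind.
    print(safety_score([(0.0, 1.0)]), safety_score([(180.0, 1.0)]))
```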