SWIFT-Nav: Stability-Aware Waypoint-Level TD3 with Fuzzy Arbitration for UAV Navigation in Cluttered Environments
Shuaidong Ji, Mahdi Bamdad, Francisco Cruz
TL;DR
SWIFT-Nav targets reliable UAV navigation in cluttered environments by combining a waypoint-level TD3 policy with stability-aware arbitration and a fuzzy safety layer. The system decouples planning from low-level control, trains with Prioritised Experience Replay (PER) and a decaying epsilon-greedy exploration schedule, and employs a lightweight trajectory checker to regularize proposals, enabling fast, stable learning. Its three-mode arbitration (Travel, RL, Landing) uses hysteresis, debounce, and dwell-time constraints to bound mode switching, while a fuzzy safety score biases decisions to reduce oscillations. Demonstrated in Webots on Apple Silicon, SWIFT-Nav achieves smoother, shorter trajectories and robust generalization to unseen layouts, offering a deployable approach for real-time UAV navigation in cluttered scenes.
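To make the arbitration idea concrete, the following is a minimal sketch of a three-mode switch with hysteresis, debounce, and a minimum dwell time. It is not the authors' implementation: the class name `ModeArbiter`, the thresholds, the step counts, and the rule that a low safety score hands control to the RL mode are all illustrative assumptions.

```python
from dataclasses import dataclass

TRAVEL, RL, LANDING = "Travel", "RL", "Landing"


@dataclass
class ModeArbiter:
    # All numeric defaults are hypothetical, chosen only for illustration.
    enter_rl_below: float = 0.4  # hysteresis: Travel -> RL when safety drops below this
    exit_rl_above: float = 0.6   # hysteresis: RL -> Travel when safety rises above this
    debounce_steps: int = 3      # a switch request must persist this many steps
    min_dwell_steps: int = 5     # minimum steps spent in a mode before leaving it
    mode: str = TRAVEL
    _pending: str = TRAVEL
    _pending_count: int = 0
    _dwell: int = 0

    def _requested(self, safety: float, at_goal: bool) -> str:
        """Mode suggested by the current inputs, before debounce and dwell checks."""
        if at_goal:
            return LANDING
        if self.mode == TRAVEL and safety < self.enter_rl_below:
            return RL
        if self.mode == RL and safety > self.exit_rl_above:
            return TRAVEL
        return self.mode  # inside the hysteresis band: keep the current mode

    def step(self, safety: float, at_goal: bool = False) -> str:
        self._dwell += 1
        if self.mode == LANDING:
            return self.mode  # landing is treated as terminal in this sketch
        want = self._requested(safety, at_goal)
        if want == self.mode:
            self._pending_count = 0  # the request vanished: reset the debounce counter
            return self.mode
        if want == self._pending:
            self._pending_count += 1
        else:
            self._pending, self._pending_count = want, 1
        if (self._pending_count >= self.debounce_steps
                and self._dwell >= self.min_dwell_steps):
            self.mode = want
            self._dwell = 0
            self._pending_count = 0
        return self.mode
```

With these defaults, a safety score that flickers between 0.3 and 0.7 never leaves Travel, because no request survives the debounce window, whereas a sustained low score switches to RL once the dwell time has elapsed.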
Abstract
Efficient and reliable UAV navigation in cluttered and dynamic environments remains challenging. We propose SWIFT-Nav: Stability-aware Waypoint-level Integration of Fuzzy arbitration and TD3 for Navigation, a TD3-based navigation framework that achieves fast, stable convergence to obstacle-aware paths. The system couples a sensor-driven perception front end with a TD3 waypoint policy: the perception module converts LiDAR ranges into a confidence-weighted safety map and goal cues, while the TD3 policy is trained with Prioritised Experience Replay to focus on high-error transitions and with a decaying epsilon-greedy exploration schedule that gradually shifts from exploration to exploitation. A lightweight fuzzy-logic layer computes a safety score from radial measurements and nearby obstacles, gates mode switching, and clamps unsafe actions; in parallel, task-aligned reward shaping that combines goal progress, clearance, and switch-economy terms provides dense, well-scaled feedback that accelerates learning. Implemented in Webots with proximity-based collision checking, our approach consistently outperforms baselines in trajectory smoothness and generalization to unseen layouts, while preserving real-time responsiveness. These results show that combining TD3 with replay prioritisation, calibrated exploration, and fuzzy-safety rules yields a robust and deployable solution for UAV navigation in cluttered scenes.
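As a rough illustration of the fuzzy safety layer described above, the sketch below derives a safety score in [0, 1] from a list of LiDAR ranges and uses it to clamp a proposed waypoint step. The abstract does not specify the membership functions or rules, so the ramp memberships, the min-based fuzzy AND, the function names, and every threshold here are assumptions made for illustration only.

```python
import math


def ramp(x, lo, hi):
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def fuzzy_safety(ranges, near=1.0, far=3.0, max_crowd_fraction=0.5):
    """Combine 'closest obstacle is far' AND 'few beams see obstacles' (hypothetical rules)."""
    if not ranges:
        raise ValueError("ranges must be non-empty")
    # Membership 1: clearance of the closest return, in metres (assumed unit).
    clearance = ramp(min(ranges), near, far)
    # Membership 2: how uncrowded the scan is, from the share of beams closer than `far`.
    crowd_fraction = sum(r < far for r in ranges) / len(ranges)
    uncrowded = 1.0 - ramp(crowd_fraction, 0.0, max_crowd_fraction)
    # Fuzzy AND via the minimum operator.
    return min(clearance, uncrowded)


def clamp_step(step, safety, max_step=2.0):
    """Shrink a proposed (dx, dy) waypoint step so its length is at most max_step * safety."""
    dx, dy = step
    norm = math.hypot(dx, dy)
    limit = max_step * safety
    if norm <= limit:
        return (dx, dy)  # already within the allowed length (covers the zero step)
    scale = limit / norm
    return (dx * scale, dy * scale)


# One moderately close return among four beams lowers the score and shortens the step.
score = fuzzy_safety([2.0, 5.0, 5.0, 5.0])
safe_step = clamp_step((3.0, 4.0), score)
```

The same score could also drive mode gating, for example as the input to a hysteresis-based arbiter, but how SWIFT-Nav wires the two together is not stated in the abstract.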
