UAV Trajectory Optimization via Improved Noisy Deep Q-Network
Zhang Hengyu, Maryam Cheraghy, Liu Wei, Armin Farhadi, Meysam Soltanpour, Zhong Zhuoqing
TL;DR
This paper addresses UAV trajectory optimization under communication constraints by formulating a discrete, grid-based navigation task and solving it with an Improved Noisy DQN. The method blends residual NoisyLinear layers with adaptive noise scheduling, Double DQN target estimation, and soft target updates to enhance exploration, training stability, and sample efficiency. Key contributions include a learnable-noise network architecture, a performance-aware noise schedule with periodic resampling, and a comprehensive evaluation showing faster convergence, higher rewards, and fewer steps than standard DQN variants. The work advances practical, robust RL-based UAV navigation in cluttered, signal-limited environments with implications for real-time planning and reliability in communication-constrained scenarios.
Abstract
This paper proposes an Improved Noisy Deep Q-Network (Noisy DQN) to enhance exploration and training stability when deep reinforcement learning is applied to Unmanned Aerial Vehicle (UAV) navigation in simulated environments. The method strengthens exploration by combining residual NoisyLinear layers with an adaptive noise scheduling mechanism, and improves training stability through a smooth loss and soft target network updates. Experiments show that the proposed model converges faster and achieves rewards up to $40$ higher than standard DQN, and that it quickly reaches the minimum number of steps required for the task (28) in the $15 \times 15$ grid navigation environment. The results show that our combined improvements to the NoisyNet network structure, exploration control, and training stability enhance the efficiency and reliability of deep Q-learning.
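To make two of the stabilizing components named above concrete, the following is a minimal sketch of a Double DQN target and a soft (Polyak) target network update, written in plain Python over lists rather than tensors. It is an illustration under stated assumptions, not the authors' implementation: the function names `double_dqn_target` and `soft_update`, the discount factor `gamma`, and the mixing coefficient `tau` are hypothetical and do not come from the paper.

```python
from typing import List


def double_dqn_target(reward: float, done: bool, gamma: float,
                      next_q_online: List[float],
                      next_q_target: List[float]) -> float:
    """Double DQN target for one transition (hypothetical helper).

    The online network selects the greedy next action; the target
    network supplies the value of that action.
    """
    if done:
        # Terminal transition: no bootstrapped value is added.
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


def soft_update(target_params: List[float], online_params: List[float],
                tau: float) -> List[float]:
    """Soft target update: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_params, online_params)]


# Usage with made-up numbers (not results from the paper).
y = double_dqn_target(reward=1.0, done=False, gamma=0.5,
                      next_q_online=[0.1, 0.9, 0.3],
                      next_q_target=[2.0, 4.0, 8.0])
new_target = soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5)
```

In the usage example the online network prefers action 1, so the target uses the target network's value for action 1 rather than its own maximum (action 2). Decoupling selection from evaluation in this way is what reduces overestimation in Double DQN. A small `tau` makes the target network track the online network slowly, which is the usual motivation for soft updates.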
