Trajectory Design for UAV-Based Low-Altitude Wireless Networks in Unknown Environments: A Digital Twin-Assisted TD3 Approach

Jihao Luo, Zesong Fei, Xinyi Wang, Le Zhao, Yuanhao Cui, Guangxu Zhu, Dusit Niyato

TL;DR

The paper addresses UAV trajectory design for low-altitude wireless networks (LAWNs) in unknown environments by pairing a digital twin (DT)-assisted training and deployment framework (DTTDF) with a trajectory design scheme, SATD3TD. Virtual environments (VEs) built in the DT drive offline training and are updated online with UAV sensing data to guide safe, efficient flight decisions in a TDMA/ISAC setting. The main contributions are (i) the DTTDF, which accelerates training and improves flight safety, and (ii) the SATD3TD scheme, which combines simulated annealing for initial user scheduling with TD3 for continuous trajectory optimization, yielding faster convergence, fewer collisions, and shorter mission completion times than the baselines. The approach improves the reliability and efficiency of LAWN deployments in uncertain settings and may extend to larger UAV fleets and dynamic ground-user demands.

Abstract

Unmanned aerial vehicles (UAVs) are emerging as key enablers for low-altitude wireless networks (LAWNs), particularly when terrestrial networks are unavailable. In such scenarios, the environmental topology is typically unknown; hence, designing efficient and safe UAV trajectories is essential yet challenging. To address this, we propose a digital twin (DT)-assisted training and deployment framework. In this framework, the UAV transmits integrated sensing and communication signals to provide communication services to ground users, while simultaneously collecting echoes that are uploaded to the DT server to progressively construct virtual environments (VEs). These VEs accelerate model training and are continuously updated with real-time UAV sensing data during deployment, supporting decision-making and enhancing flight safety. Based on this framework, we further develop a trajectory design scheme that integrates simulated annealing for efficient user scheduling with the twin-delayed deep deterministic policy gradient algorithm for continuous trajectory design, aiming to minimize mission completion time while ensuring obstacle avoidance. Simulation results demonstrate that the proposed approach achieves faster convergence, higher flight safety, and shorter mission completion time compared with baseline methods, providing a robust and efficient solution for LAWN deployment in unknown environments.
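
The scheduling step mentioned in the abstract can be illustrated with a generic simulated annealing loop over the order in which ground users are visited. This is a minimal sketch under stated assumptions, not the authors' scheme: the cost is plain Euclidean path length from a start point, the neighbor move is a segment reversal, and the names `path_length` and `anneal_schedule`, the geometric cooling schedule, and all default parameters are hypothetical.

```python
import math
import random


def path_length(order, start, users):
    """Length of the path start -> users[order[0]] -> users[order[1]] -> ...

    Assumption: Euclidean distance stands in for the real mission cost.
    """
    pos, total = start, 0.0
    for idx in order:
        total += math.dist(pos, users[idx])
        pos = users[idx]
    return total


def anneal_schedule(start, users, t0=10.0, cooling=0.995, steps=5000, seed=0):
    """Search for a short visiting order with classical simulated annealing."""
    n = len(users)
    order = list(range(n))
    if n < 2:
        return order, path_length(order, start, users)

    rng = random.Random(seed)
    rng.shuffle(order)
    cost = path_length(order, start, users)
    best_order, best_cost = order[:], cost
    temp = t0

    for _ in range(steps):
        # Neighbor move: reverse a randomly chosen contiguous segment.
        i, j = sorted(rng.sample(range(n), 2))
        cand = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
        cand_cost = path_length(cand, start, users)
        delta = cand_cost - cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            order, cost = cand, cand_cost
            if cost < best_cost:
                best_order, best_cost = order[:], cost
        temp *= cooling  # geometric cooling (assumed schedule)

    return best_order, best_cost
```

Calling `anneal_schedule((0.0, 0.0), users)` with a list of 2D or 3D user coordinates returns a visiting order and its path length; the order could then seed a continuous trajectory optimizer.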

Paper Structure

This paper contains 24 sections, 21 equations, 11 figures, 2 tables, 2 algorithms.

Figures (11)

  • Figure 1: An illustration of UAV-based LAWN deployment system in an unknown environment. The UAV provides communication services to GUs while performing environment sensing, and the DT server leverages the sensing feedback to update the VE and generate flight policies. The top-left inset depicts the UAV’s movement and orientation in a 3D coordinate system.
  • Figure 2: An illustration of the time slot protocol for the mission. Each slot is divided into three sub-slots that complete a full cycle of operations, namely BS-UAV control and data transmission, UAV-GU service with ISAC signals, and UAV-BS sensing feedback, thereby enabling both communication and DT-driven trajectory updates.
  • Figure 3: An illustration of the DTTDF and UAV control mechanism based on the TD3 framework. The left side shows the DTTDF, where UAV sensing data update a VE in the DT layer, which then generates control signals to guide the UAV’s trajectory. The right side depicts the TD3 framework, which leverages state observations from the DT to derive the UAV’s flight policy via twin actor-critic networks, experience replay, soft target updates, etc.
  • Figure 4: An illustration of the state space with two components. (a) $\mathbf{S}^t_1$ for service delivery, where the APF attracts the UAV toward GUs according to their priority. The UAV’s communication range is indicated by the dashed circle. (b) $\mathbf{S}^t_2$ for obstacle avoidance, showing detected buildings along with the UAV’s position and sensing range, providing information for safe navigation.
  • Figure 5: Comparison of the path length over iterations for random scheduling, classical annealing, and the proposed algorithm.
  • ...and 6 more figures
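
The Figure 3 caption refers to twin critics and soft target updates. As a rough illustration of what those terms mean in standard TD3, the sketch below computes the clipped double-Q target, applies target policy smoothing, and performs a Polyak (soft) update on plain Python lists. It is not the paper's implementation: the function names are hypothetical, and the defaults (`gamma`, `tau`, `noise_std`, `noise_clip`, `a_max`) are commonly used TD3 values assumed here, not values taken from the paper.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(mu_next, noise_std=0.2, noise_clip=0.5,
                           a_max=1.0, rng=random):
    """Target policy smoothing: add clipped Gaussian noise to the target
    actor's action, then clamp to the action bounds."""
    return [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             -a_max, a_max)
        for a in mu_next
    ]


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates to curb overestimation."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * wt
            for w, wt in zip(online_params, target_params)]
```

In a full TD3 loop the actor and the target networks are updated less often than the critics (the "delayed" part of the name); that schedule is omitted here for brevity.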