Integrated Sensing and Communications for Low-Altitude Economy: A Deep Reinforcement Learning Approach

Xiaowen Ye, Yuyi Mao, Xianghao Yu, Shu Sun, Liqun Fu, Jie Xu

TL;DR

The paper proposes Deep LAE-ISAC (DeepLSC), a novel LAE-oriented ISAC scheme, together with a symmetric experience augmentation mechanism that simultaneously permutes the indexes of all variables to enrich the available experience sets and speed up DeepLSC's convergence.

Abstract

This paper studies an integrated sensing and communications (ISAC) system for the low-altitude economy (LAE), where a ground base station (GBS) provides communication and navigation services for authorized unmanned aerial vehicles (UAVs) while sensing the low-altitude airspace to monitor an unauthorized mobile target. The expected communication sum-rate over a given flight period is maximized by jointly optimizing the beamforming at the GBS and the UAVs' trajectories, subject to constraints on the average signal-to-noise ratio required for sensing, the flight mission and collision avoidance of the UAVs, and the maximum transmit power at the GBS. Because this is a sequential decision-making problem with a given flight mission, we recast it as a specific Markov decision process (MDP) model called an episode task. Based on this model, we propose a novel LAE-oriented ISAC scheme, referred to as Deep LAE-ISAC (DeepLSC), that leverages deep reinforcement learning (DRL). In DeepLSC, a reward function and a new action selection policy, termed the constrained noise-exploration policy, are designed so that the various constraints are satisfied. To enable efficient learning in episode tasks, we develop a hierarchical experience replay mechanism, whose key idea is to use all experiences generated within each episode to jointly train the neural network. In addition, to speed up the convergence of DeepLSC, we propose a symmetric experience augmentation mechanism, which simultaneously permutes the indexes of all variables to enrich the available experience sets. Simulation results demonstrate that, compared with benchmarks, DeepLSC yields a higher sum-rate while meeting the preset constraints, converges faster, and is more robust across different settings.
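The symmetric experience augmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the array layout (one row per UAV in the state, action, and next state), the function names `permute_experience` and `augment`, and the assumption that the reward is unchanged when UAVs are relabeled are all hypothetical choices made for illustration. The sketch only shows what applying one index permutation consistently to every per-UAV variable might look like.

```python
import numpy as np

def permute_experience(state, action, reward, next_state, perm):
    """Apply one UAV-index permutation consistently to every per-UAV array.

    Assumption: row k of each array belongs to UAV k, and the scalar reward
    does not depend on how the UAVs are labeled.
    """
    perm = np.asarray(perm)
    return state[perm], action[perm], reward, next_state[perm]

def augment(experience, num_copies, rng):
    """Return the original experience plus `num_copies` randomly permuted copies."""
    state, action, reward, next_state = experience
    num_uavs = state.shape[0]
    augmented = [experience]
    for _ in range(num_copies):
        perm = rng.permutation(num_uavs)  # same relabeling for all variables
        augmented.append(
            permute_experience(state, action, reward, next_state, perm)
        )
    return augmented

# Hypothetical toy experience with K = 3 UAVs.
rng = np.random.default_rng(0)
K = 3
state = np.arange(K * 2, dtype=float).reshape(K, 2)   # e.g. one (x, y) row per UAV
action = np.arange(K * 4, dtype=float).reshape(K, 4)  # e.g. per-UAV decision variables
next_state = state + 1.0
batch = augment((state, action, 5.0, next_state), num_copies=2, rng=rng)
```

Each permuted copy describes the same physical situation with relabeled UAVs, so a single stored experience contributes several training samples to the mini-batch.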

Paper Structure

This paper contains 27 sections, 30 equations, 6 figures, 4 tables, 1 algorithm.

Figures (6)

  • Figure 1: LAE-oriented ISAC systems.
  • Figure 2: DeepLSC framework, including the execution phase and the training phase. During the execution phase, based on $\mathbf{S}(t)$, the eval-actor outputs the temporary joint beamforming and trajectory decision ${\pi}_{\text{a}}(\mathbf{S}(t),\mathbf{\Theta}_{\text{a}})$, which is refined by the constrained noise-exploration policy. During the training phase, some experience sets sampled from the experience buffer are augmented to form the mini-batch for training the actor-critic architecture.
  • Figure 3: Communication sum-rate achieved by various schemes.
  • Figure 4: Flight trajectory of a specific UAV under various schemes.
  • Figure 5: Sum-rate of various schemes under different numbers of UAVs.
  • ...and 1 more figure