Reinforcement Learning for Enhancing Sensing Estimation in Bistatic ISAC Systems with UAV Swarms

Obed Morrison Atsu, Salmane Naoumi, Roberto Bomfin, Marwa Chafii

TL;DR

The paper addresses sensing performance in bistatic ISAC networks using a UAV swarm by formulating UAV placement and trajectory as a partially observable multi-agent problem with centralized training and decentralized execution. It proposes a cooperative MARL framework with emergent communication and a transmission power adaptation mechanism to mitigate interference and improve environmental awareness. Key contributions include a PO-MDP formulation for UAV path planning, a decentralized MARL architecture with LSTM-Attention for processing local observations and inter-agent messages, and a power-control strategy to maximize SINR under realistic channel conditions. The results demonstrate robust target coverage and high communication efficiency across varying environment sizes, numbers of UAVs, and target loads, indicating strong potential for scalable ISAC networks in future 6G scenarios.

Abstract

This paper introduces a novel Multi-Agent Reinforcement Learning (MARL) framework to enhance integrated sensing and communication (ISAC) networks using unmanned aerial vehicle (UAV) swarms as sensing radars. By framing the positioning and trajectory optimization of UAVs as a Partially Observable Markov Decision Process, we develop a MARL approach that leverages centralized training with decentralized execution to maximize the overall sensing performance. Specifically, we implement a decentralized cooperative MARL strategy that enables UAVs to develop effective communication protocols, thereby enhancing their environmental awareness and operational efficiency. Additionally, we augment the MARL solution with a transmission power adaptation technique to mitigate interference between the communicating drones and optimize the efficiency of the learned communication protocol. Despite the increased complexity, our solution demonstrates robust performance and adaptability across various scenarios, providing a scalable and cost-effective enhancement for future ISAC networks.

Paper Structure (9 sections, 11 equations, 4 figures, 2 tables, 1 algorithm)

This paper contains 9 sections, 11 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: Illustration of the architecture of our proposed MARL algorithm with communication.
  • Figure 2: Evolution of the percentage of targets detected by the UAV swarm in environments with varying numbers of targets $q$.
  • Figure 3: Episodic cumulative reward comparison in environments with increasing UAV number ($M$).
  • Figure 4: Inter-UAV communication efficiency as the percentage of messages exceeding the SINR threshold.