Turning Threat into Opportunity: DRL-Powered Anti-Jamming via Energy Harvesting in UAV-Disrupted Channels
Ngoc-Tan Nguyen, Thi-Thu Hoang, Trung-Dung Hoang, Thai-Duong Nguyen
TL;DR
The paper tackles intelligent UAV-based jamming in ambient backscatter communication (AmBC) systems. It formulates the problem as a Markov Decision Process and solves it with a Deep Q-Network, enabling a resource-constrained transmitter to switch between active transmission, energy harvesting, and backscattering. The proposed DRL approach uses experience replay, a target network, and epsilon-greedy exploration to learn robust policies that converge faster than traditional Q-learning and outperform a heuristic greedy baseline in throughput and reliability. Key contributions include a comprehensive system model, an MDP formulation for mode switching under partial observability, and extensive simulations showing that the DRL policy can turn jamming signals into exploitable energy or backscatter opportunities. This work advances anti-jamming techniques for AmBC in UAV-threatened environments and suggests future directions such as multi-agent DRL and added security metrics for real-world deployments.
Abstract
The open and broadcast nature of wireless communication systems, while enabling ubiquitous connectivity, also exposes them to jamming attacks that may critically compromise network performance or disrupt service availability. The proliferation of Unmanned Aerial Vehicles (UAVs) introduces a new dimension to this threat, as UAVs can act as mobile, intelligent jammers capable of launching sophisticated attacks by leveraging Line-of-Sight (LoS) channels and adaptive strategies. This paper addresses the critical challenge of countering intelligent UAV jamming in energy-constrained ambient backscatter communication systems. Traditional anti-jamming techniques often fall short against such dynamic threats or are unsuitable for low-power backscatter devices. Hence, we propose a novel anti-jamming framework based on Deep Reinforcement Learning (DRL) that empowers the transmitter not only to defend against the UAV's jamming signals but also to exploit them strategically. In particular, our approach allows the transmitter to learn an optimal policy for switching between active transmission, energy harvesting from the jamming signal, and backscattering information using the jammer's own emissions. We formulate the problem as a Markov Decision Process (MDP) and employ a Deep Q-Network (DQN) to derive the optimal operational strategy. Simulation results demonstrate that our DQN-based method significantly outperforms conventional Q-learning in convergence speed and surpasses a greedy anti-jamming strategy in terms of average throughput, packet loss rate, and packet delivery ratio.
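The mode-switching decision described above can be illustrated with a minimal Python sketch of two standard DQN ingredients: epsilon-greedy action selection and the bootstrapped target computed from a target network. This is not the authors' implementation. The action labels, the function names `epsilon_greedy` and `td_target`, and the default discount factor are assumptions chosen for illustration, and the Q-values are passed in as plain lists rather than produced by a neural network.

```python
import random

# Hypothetical labels for the three operating modes named in the abstract.
ACTIONS = ("transmit", "harvest", "backscatter")


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_target, gamma=0.9, done=False):
    """Standard DQN target: r + gamma * max_a' Q_target(s', a'), or r if terminal.

    next_q_target holds the target network's Q-values for the next state.
    gamma=0.9 is an assumed default, not a value taken from the paper.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_target)


if __name__ == "__main__":
    q = [0.1, 0.5, 0.2]  # made-up Q-values for one state
    print(ACTIONS[epsilon_greedy(q, epsilon=0.0)])  # greedy choice
    print(td_target(1.0, [0.0, 2.0, 1.0], gamma=0.5))
```

With `epsilon=0.0` the selection is purely greedy, so the example picks the mode with the largest Q-value. In a full DQN, the target value would serve as the regression label for the online network on transitions sampled from the replay buffer.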
