How to Combat Reactive and Dynamic Jamming Attacks with Reinforcement Learning
Yalin E. Sagduyu, Tugba Erpek, Kemal Davaslioglu, Sastry Kompella
TL;DR
This work addresses reactive and dynamic jamming by formulating the transmitter–receiver interaction as a Markov decision process and applying model-free reinforcement learning to adapt transmission parameters. It uses Q-learning for discrete-state scenarios and Deep Q-Networks (DQN) for continuous-state observations, adapting power, modulation, and channel selection to maximize throughput. Rewards are based on the Shannon rate $R = \log_2(1 + \mathrm{SINR})$ and, in PCAM, on the modulation-specific rate $r_t = \log_2(M(t))\,\bigl(1 - \mathrm{BER}(M(t); \mathrm{SINR}_t)\bigr)$. The results show rapid learning and resilient performance under evolving jamming strategies, with multi-channel hopping further improving outcomes. The study provides a practical RL-based anti-jamming framework applicable to both commercial and tactical wireless networks, and highlights the trade-offs between discrete and continuous state representations and between single-channel and multi-channel setups.
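To make the two reward expressions concrete, the following is a minimal Python sketch, not the authors' implementation. The function names are hypothetical, SINR is assumed to be in linear scale (not dB), and the default BER model is a common textbook approximation for M-QAM, $0.2\,e^{-1.5\,\mathrm{SINR}/(M-1)}$, swapped in purely for illustration; the BER model actually used in the paper may differ.

```python
import math
from typing import Callable


def shannon_rate(sinr: float) -> float:
    """Shannon rate R = log2(1 + SINR) in bits/s/Hz, with SINR in linear scale."""
    return math.log2(1.0 + sinr)


def approx_mqam_ber(m: int, sinr: float) -> float:
    """ASSUMPTION: textbook M-QAM approximation, BER ~ 0.2 * exp(-1.5 * SINR / (M - 1)).

    This is an illustrative stand-in, not the BER model from the paper.
    The value is capped at 0.5 so it stays a valid error probability.
    """
    return min(0.5, 0.2 * math.exp(-1.5 * sinr / (m - 1)))


def pcam_reward(
    m: int,
    sinr: float,
    ber_fn: Callable[[int, float], float] = approx_mqam_ber,
) -> float:
    """Modulation-specific reward r_t = log2(M) * (1 - BER(M; SINR_t))."""
    return math.log2(m) * (1.0 - ber_fn(m, sinr))
```

For example, `shannon_rate(3.0)` evaluates $\log_2 4$, and `pcam_reward(16, sinr)` approaches 4 bits per symbol as the SINR grows and the assumed BER vanishes. Passing a different `ber_fn` lets the same reward be evaluated under another modulation or error model.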
Abstract
This paper studies the problem of mitigating reactive jamming, where a jammer adopts a dynamic policy of selecting channels and sensing thresholds to detect and jam ongoing transmissions. Without prior knowledge of channel conditions or jamming strategies, the transmitter-receiver pair learns to avoid jamming and optimize throughput over time by using reinforcement learning (RL) to adapt transmit power, modulation, and channel selection. Q-learning is used for discrete jamming-event states, while Deep Q-Networks (DQN) are used for continuous states based on received power. Across different reward functions and action sets, the results show that RL can adapt rapidly to spectrum dynamics and sustain high rates as channels and jamming policies change over time.
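The discrete-state case can be illustrated with a small tabular Q-learning sketch. Everything about the environment here is an assumption made for illustration and is not the paper's setup: a toy reactive jammer jams whichever channel the transmitter used in the previous slot, the state is that previous channel, the action is the next channel, and the reward is 1 for an unjammed slot and 0 otherwise. The helper names and hyperparameters are hypothetical.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the highest-valued action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard Q-learning update toward reward + gamma * max_a' Q(next_state, a')."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(num_channels=2, steps=5000, epsilon=0.1, seed=0):
    """Train against the toy jammer (ASSUMPTION: it jams the previously used channel)."""
    rng = random.Random(seed)
    actions = list(range(num_channels))
    q = defaultdict(float)  # Q-table, all entries start at 0
    state = 0  # channel used in the previous slot
    for _ in range(steps):
        action = epsilon_greedy(q, state, actions, epsilon, rng)
        reward = 0.0 if action == state else 1.0  # jammed if we stay on the same channel
        next_state = action
        q_update(q, state, action, reward, next_state, actions)
        state = next_state
    return q, actions


def greedy_policy(q, actions):
    """Best learned channel for each previous-channel state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in actions}
```

Calling `greedy_policy(*train())` returns the learned channel choice for each state; against this toy jammer the transmitter should learn to hop away from the channel it just used. A DQN replaces the table `q` with a neural network so that continuous observations such as received power can serve as the state.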
