Achieving Hiding and Smart Anti-Jamming Communication: A Parallel DRL Approach against Moving Reactive Jammer
Yangyang Li, Yuhua Xu, Wen Li, Guoxin Li, Zhibing Feng, Songyi Liu, Jiatao Du, Xinran Li
TL;DR
This work tackles anti-jamming under moving reactive jammers by jointly optimizing frequency agility and spread-spectrum parameters using a parallel DRL framework. By decomposing the action space into two parallel networks (frequency and spreading factor) atop a CNN backbone, the method mitigates the curse of dimensionality and accelerates convergence without relying on $\varepsilon$-greedy exploration. The design enforces coordinated decisions through interconnected rewards and uses soft-target updates to stabilize learning, achieving about a 90% enhancement in normalized throughput relative to baselines. The approach demonstrates robustness across channel models and jamming patterns, offering a practical pathway for real-time anti-jamming in adversarial wireless environments.
Abstract
This paper addresses the challenge of anti-jamming in moving reactive jamming scenarios. The moving reactive jammer initiates high-power tracking jamming upon detecting any transmission activity, and when unable to detect a signal, resorts to indiscriminate jamming. This presents dual imperatives: maintaining hiding to avoid the jammer's detection and simultaneously evading indiscriminate jamming. Spread spectrum techniques effectively reduce transmitting power to elude detection but fall short in countering indiscriminate jamming. Conversely, changing communication frequencies can help evade indiscriminate jamming but makes the transmission vulnerable to tracking jamming without spread spectrum techniques to remain hidden. Current methodologies struggle with the complexity of simultaneously optimizing these two requirements due to the expansive joint action spaces and the dynamics of moving reactive jammers. To address these challenges, we propose a parallelized deep reinforcement learning (DRL) strategy. The approach includes a parallelized network architecture designed to decompose the action space. A parallel exploration-exploitation selection mechanism replaces the $\varepsilon$-greedy mechanism, accelerating convergence. Simulations demonstrate a nearly 90% increase in normalized throughput.
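The action-space decomposition mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the channel and spreading-factor counts, and the random stand-in Q-values are all hypothetical, and the per-head greedy choice shown here is a plain argmax, not the paper's parallel exploration-exploitation mechanism, which the abstract does not specify. The sketch only shows why two parallel heads need fewer outputs than a single head over the joint action space.

```python
import random


def output_sizes(num_freqs, num_sfs):
    """Return (joint, parallel) output counts for the two designs.

    A single head over the joint space needs one output per
    (frequency, spreading factor) pair; two parallel heads need one
    output per frequency plus one per spreading factor.
    """
    joint = num_freqs * num_sfs
    parallel = num_freqs + num_sfs
    return joint, parallel


def select_parallel_action(q_freq, q_sf):
    """Pick one index per head by taking each head's largest value.

    Assumption: plain greedy selection, used here only for illustration.
    """
    freq_idx = max(range(len(q_freq)), key=q_freq.__getitem__)
    sf_idx = max(range(len(q_sf)), key=q_sf.__getitem__)
    return freq_idx, sf_idx


# Hypothetical sizes: 10 candidate frequencies, 6 spreading factors.
NUM_FREQS, NUM_SFS = 10, 6

# Random numbers stand in for the outputs of the two network heads.
rng = random.Random(0)
q_freq = [rng.random() for _ in range(NUM_FREQS)]
q_sf = [rng.random() for _ in range(NUM_SFS)]

joint, parallel = output_sizes(NUM_FREQS, NUM_SFS)
freq_idx, sf_idx = select_parallel_action(q_freq, q_sf)

print(f"joint head outputs: {joint}, parallel head outputs: {parallel}")
print(f"chosen frequency index: {freq_idx}, spreading-factor index: {sf_idx}")
```

With these assumed sizes the joint design needs 60 outputs while the parallel design needs 16, and the gap widens as either dimension grows. Choosing each index independently loses information about how the two decisions interact, which is presumably why the two networks have to be coordinated during training.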
