A DRL-Empowered Multi-Level Jamming Approach for Secure Semantic Communication
Weixuan Chen, Qianqian Yang
TL;DR
This work targets secure semantic communication over challenging MIMO fading wiretap channels by introducing a DRL-enabled multi-level jamming scheme that combines semantic-layer jamming (text-encoded) and physical-layer jamming (Gaussian noise). A DDPG agent jointly designs three precoders to balance high PSNR for the legitimate user with degraded performance at the eavesdropper, trained via a five-stage alternating strategy that integrates SemCom and reinforcement learning. Experimental results on CIFAR-10 and related text benchmarks show that the proposed approach achieves security comparable to encryption-based and encoded-jammer baselines while improving the legitimate receiver’s PSNR by up to about 0.6 dB across various SNRs and compression ratios. The method offers a practical secure SemCom solution without secret-key exchange and with negligible inference-time overhead, enabling robust privacy protection in shared-channel scenarios.
Abstract
Semantic communication (SemCom) aims to transmit only task-relevant information, thereby improving communication efficiency but also exposing semantic information to potential eavesdropping. In this paper, we propose a deep reinforcement learning (DRL)-empowered multi-level jamming approach to enhance the security of SemCom systems over MIMO fading wiretap channels. This approach combines semantic-layer jamming, achieved by encoding task-irrelevant text, and physical-layer jamming, achieved by encoding random Gaussian noise. These two-level jamming signals are superposed with task-relevant semantic information to protect the transmitted semantics from eavesdropping. A deep deterministic policy gradient (DDPG) algorithm is further introduced to dynamically design and optimize the precoding matrices for both task-relevant semantic information and multi-level jamming signals, aiming to enhance the legitimate user's image reconstruction while degrading the eavesdropper's performance. To jointly train the SemCom model and the DDPG agent, we propose an alternating optimization strategy where the two modules are updated iteratively. Experimental results demonstrate that, compared with both the encryption-based (ESCS) and encoded jammer-based (EJ) benchmarks, our method achieves comparable security while improving the legitimate user's peak signal-to-noise ratio (PSNR) by up to approximately 0.6 dB.
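The superposition described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the antenna counts, the function names (`rayleigh_channel`, `superpose`, `psnr`), and the power normalization are assumptions made for illustration. The three streams are random stand-ins for the encoded image semantics, the encoded task-irrelevant text, and the encoded Gaussian noise, and the precoders are drawn at random, whereas in the proposed approach they are designed by the DDPG agent.

```python
import numpy as np


def complex_gaussian(shape, rng):
    """Unit-variance circularly symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def rayleigh_channel(n_rx, n_tx, rng):
    """i.i.d. Rayleigh-fading MIMO channel matrix (assumed channel model)."""
    return complex_gaussian((n_rx, n_tx), rng)


def superpose(precoders, streams, total_power=1.0):
    """Precode each stream, sum the results, and scale to the average power budget."""
    x = sum(W @ s for W, s in zip(precoders, streams))
    scale = np.sqrt(total_power / np.mean(np.abs(x) ** 2))
    return scale * x


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for signals with the given peak value."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)


rng = np.random.default_rng(0)
n_tx, n_rx, n_sym = 4, 4, 64  # hypothetical dimensions

# Random stand-ins for the three encoded streams (not real encoder outputs).
s_semantic = complex_gaussian((n_tx, n_sym), rng)   # task-relevant semantics
s_text_jam = complex_gaussian((n_tx, n_sym), rng)   # semantic-layer jamming
s_noise_jam = complex_gaussian((n_tx, n_sym), rng)  # physical-layer jamming

# Random precoders; the paper's DDPG agent would output these instead.
precoders = [complex_gaussian((n_tx, n_tx), rng) for _ in range(3)]

x = superpose(precoders, [s_semantic, s_text_jam, s_noise_jam])

# Legitimate user's received signal at an assumed SNR of 10 dB.
snr_db = 10.0
noise_std = np.sqrt(10.0 ** (-snr_db / 10.0))
H_legit = rayleigh_channel(n_rx, n_tx, rng)
y_legit = H_legit @ x + noise_std * complex_gaussian((n_rx, n_sym), rng)
```

The eavesdropper's observation would be formed the same way with an independent channel draw. The `psnr` helper shows how a gain such as the reported 0.6 dB would be measured between an original image and its reconstruction, both assumed here to be scaled to the range [0, 1].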
