A Hybrid Training-time and Run-time Defense Against Adversarial Attacks in Modulation Classification
Lu Zhang, Sangarapillai Lambotharan, Gan Zheng, Guisheng Liao, Ambra Demontis, Fabio Roli
TL;DR
The paper tackles the vulnerability of deep-learning-based automatic modulation classification (AMC) to adversarial examples in white-box settings. It introduces a hybrid training-time and run-time defense (HTRD) with two parts. The first is Customized Adversarial Training (CAT), which uses adaptive perturbation budgets $\epsilon_i$ and adaptive label smoothing $\tilde{y}_i$. The second is a run-time neural rejection (NR) detector based on an RBF-SVM trained on last-layer features, with a rejection threshold $S_0$. The white-box attack is constrained by $\|x-x'\|_2 \le \varepsilon$, where $\varepsilon = \sqrt{\mathrm{PNR} \cdot \|x\|_2^2 / (\mathrm{SNR}+1)}$, with PNR the perturbation-to-noise ratio and SNR the signal-to-noise ratio. CAT enlarges the input-space margin, yielding clearer class separation at the last feature layer and a larger rejection region, while the NR detector guards against low-confidence misclassifications. On the RML2016.10a dataset, CAT outperforms LS-GNA among DNN defenses, and the combined HTRD approach outperforms prior two-fold defenses and NR-based schemes, achieving higher robustness with minimal loss in benign accuracy. The results suggest that jointly optimizing training-time robustness with a run-time rejection mechanism can meaningfully strengthen modulation classification in adversarial environments by raising the cost and difficulty of successful attacks, with practical implications for secure cognitive radio and related wireless systems.
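To make the attack constraint concrete, the following minimal sketch computes the $\ell_2$ budget $\varepsilon$ from the formula above and projects a candidate adversarial signal back onto the allowed ball. It is an illustration, not the authors' code: the function names are hypothetical, the signal is assumed to be a flattened list of real-valued I/Q samples, and PNR and SNR are assumed to be supplied in dB and converted to linear ratios before use.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from dB to linear scale (assumed input convention)."""
    return 10.0 ** (value_db / 10.0)


def l2_budget(x, pnr_db, snr_db):
    """Return eps = sqrt(PNR * ||x||_2^2 / (SNR + 1)) for a flattened signal x."""
    pnr = db_to_linear(pnr_db)
    snr = db_to_linear(snr_db)
    energy = sum(v * v for v in x)  # ||x||_2^2
    return math.sqrt(pnr * energy / (snr + 1.0))


def project_onto_l2_ball(x, x_adv, eps):
    """Rescale the perturbation x_adv - x so that its l2 norm is at most eps."""
    delta = [a - b for a, b in zip(x_adv, x)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(x_adv)  # already inside the ball
    scale = eps / norm
    return [b + scale * d for b, d in zip(x, delta)]


# Toy usage: a 256-sample signal with unit-amplitude entries.
x = [1.0] * 256
eps = l2_budget(x, pnr_db=0.0, snr_db=0.0)
x_adv = project_onto_l2_ball(x, [v + 100.0 for v in x], eps)
```

Since $\|x\|_2^2/(\mathrm{SNR}+1)$ estimates the noise power contained in the received signal, the budget scales with the noise floor rather than with the signal amplitude alone.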
Abstract
Motivated by the superior performance of deep learning in many applications, including computer vision and natural language processing, several recent studies have focused on applying deep neural networks to the design of future generations of wireless networks. However, several recent works have pointed out that imperceptible, carefully designed adversarial examples (attacks) can significantly degrade classification accuracy. In this paper, we investigate a defense mechanism that combines training-time and run-time techniques to protect machine learning-based radio signal (modulation) classification against adversarial attacks. The training-time defense consists of adversarial training and label smoothing, while the run-time defense employs a support vector machine-based neural rejection (NR). Considering a white-box scenario and real datasets, we demonstrate that our proposed techniques outperform existing state-of-the-art techniques.
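The run-time rejection step can be illustrated with a small sketch. It assumes one RBF-kernel SVM score per class computed on a feature vector $z$ from the network's last layer, and it assumes the input is rejected when the highest score falls below the threshold $S_0$. The support vectors, coefficients, and function names are hypothetical placeholders rather than values or code from the paper.

```python
import math

REJECT = -1  # label returned when the sample is rejected


def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svm_score(z, support_vectors, dual_coefs, bias, gamma):
    """Decision value of one per-class SVM on the feature vector z."""
    return sum(
        c * rbf_kernel(z, sv, gamma)
        for sv, c in zip(support_vectors, dual_coefs)
    ) + bias


def classify_with_rejection(z, class_models, gamma, s0):
    """Return the top-scoring class, or REJECT if its score is below s0."""
    scores = [svm_score(z, svs, coefs, b, gamma) for svs, coefs, b in class_models]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= s0 else REJECT


# Toy models: one support vector per class in a 2-D feature space.
class_models = [
    ([[0.0, 0.0]], [1.0], -0.5),  # class 0 centred at the origin
    ([[4.0, 4.0]], [1.0], -0.5),  # class 1 centred at (4, 4)
]
near_class_0 = classify_with_rejection([0.1, 0.0], class_models, gamma=0.5, s0=0.0)
between = classify_with_rejection([2.0, 2.0], class_models, gamma=0.5, s0=0.0)
```

In this toy setting, a feature vector close to a class centre is accepted, while one lying between the two classes scores low for both and is rejected. This is the behaviour the defense relies on when an adversarial example drifts away from the benign feature clusters.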
