Adversarial Pre-Padding: Generating Evasive Network Traffic Against Transformer-Based Classifiers
Quanliang Jing, Xinxin Fan, Yanyan Liu, Jingping Bi
TL;DR
The paper addresses the evasion of transformer-based traffic classifiers by introducing AdvTraffic, a pre-padding perturbation framework guided by reinforcement learning that produces protocol-compliant, semantically disruptive packets. It analyzes why existing post-padding defenses fail against models like ET-BERT and demonstrates that perturbing the initial, semantically rich bytes yields strong adversarial effects. In extensive experiments on three real-world datasets, AdvTraffic significantly degrades classifier accuracy under both white-box and black-box defenses and transfers across diverse models. The results also support practical deployment, with caching-based online perturbation and low per-packet latency. The paper closes by outlining limitations and future directions for defending against such adversarial traffic perturbations.
Abstract
Traffic obfuscation techniques are widely used to protect the privacy and security of network data by obscuring the true patterns of traffic. However, with the emergence of pre-trained models, especially transformer-based classifiers, existing obfuscation methods have become increasingly vulnerable: recent studies report traffic classification accuracy of 99\% or higher. To counter such high-performance transformer-based classifiers, we propose a novel and effective \underline{adv}ersarial \underline{traffic}-generating approach (AdvTraffic\footnote{The code and data are available at: https://anonymous.4open.science/r/TrafficD-C461}). Our approach has two key innovations: (i) a pre-padding strategy for modifying packets, which overcomes the limitations of existing methods against transformer-based traffic classifiers; and (ii) a reinforcement learning model that optimizes the traffic perturbations to maximize adversarial effectiveness against these classifiers. To the best of our knowledge, this is the first attempt to apply adversarial perturbation techniques to defend against transformer-based traffic classifiers. Furthermore, our method can be easily deployed in practical network environments. Experiments on several real-world datasets show that the proposed method effectively undermines transformer-based classifiers, reducing classification accuracy from 99\% to as low as 25.68\%.
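
To make the contrast between pre-padding and post-padding concrete, the following is a minimal sketch and not the AdvTraffic implementation. It assumes, purely for illustration, that a classifier only reads a fixed-length prefix of each payload. The helper names (`first_window`, `pre_pad`, `post_pad`), the window size, and the padding bytes are all hypothetical, and the toy ignores protocol compliance and the reinforcement learning step entirely.

```python
# Toy illustration (assumption: the classifier sees only the first `window` bytes).

def first_window(payload: bytes, window: int) -> bytes:
    """Return the leading bytes a fixed-length classifier input would see."""
    return payload[:window]


def post_pad(payload: bytes, padding: bytes) -> bytes:
    """Append padding after the original payload."""
    return payload + padding


def pre_pad(payload: bytes, padding: bytes) -> bytes:
    """Insert padding before the original payload."""
    return padding + payload


def changed_bytes(a: bytes, b: bytes) -> int:
    """Count positions at which two equal-length byte strings differ."""
    return sum(x != y for x, y in zip(a, b))


payload = bytes(range(16))   # hypothetical 16-byte payload: 0x00 .. 0x0f
padding = b"\xff" * 8        # hypothetical 8-byte perturbation
window = 12                  # hypothetical classifier input length

baseline = first_window(payload, window)
changed_post = changed_bytes(first_window(post_pad(payload, padding), window), baseline)
changed_pre = changed_bytes(first_window(pre_pad(payload, padding), window), baseline)

print(f"post-padding changes {changed_post} of {window} visible bytes")
print(f"pre-padding changes {changed_pre} of {window} visible bytes")
```

Under this assumption, appended bytes fall outside the visible prefix and leave the classifier input unchanged, whereas prepended bytes shift and overwrite the leading positions. Choosing which bytes to prepend so that the packet stays protocol-compliant is the part the paper assigns to the reinforcement learning model, and it is not modeled here.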
