Introducing Adaptive Continuous Adversarial Training (ACAT) to Enhance ML Robustness
Mohamed elShehaby, Aditya Kotha, Ashraf Matrawy
TL;DR
ACAT addresses the scarcity of labeled adversarial data and the challenge of concept drift in cybersecurity by continuously injecting adversarial samples detected in real-world traffic into ongoing model training, using Elastic Weight Consolidation to mitigate forgetting. The method is validated on a problem-space SPAM filtering task with TextFooler-based perturbations, demonstrating robustness gains (accuracy rising from 69% to over 88% after three retraining sessions) and significantly faster inference than conventional two-model approaches. These findings suggest that continual, adaptive adversarial training can provide practical, scalable defenses against evolving attacks in security-sensitive domains, reducing retraining overhead while maintaining performance on clean data.
Abstract
Adversarial training enhances the robustness of Machine Learning (ML) models against adversarial attacks. However, obtaining labeled training and adversarial training data in network/cybersecurity domains is challenging and costly. Therefore, this letter introduces Adaptive Continuous Adversarial Training (ACAT), a method that integrates adversarial training samples into the model during continuous learning sessions using real-world detected adversarial data. Experimental results with a SPAM detection dataset demonstrate that ACAT reduces the time required for adversarial sample detection compared to traditional processes. Moreover, the accuracy of the under-attack ML-based SPAM filter increased from 69% to over 88% after just three retraining sessions.
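The Elastic Weight Consolidation step mentioned in the TL;DR can be illustrated with a minimal sketch. It uses the standard EWC quadratic penalty, (lambda / 2) * sum_i F_i * (theta_i - theta*_i)^2, added to the loss of the new retraining session. The function names, the flat-list parameter layout, and the values of `lam` and `fisher_diag` are assumptions made for illustration; the abstract does not specify how ACAT configures or implements this term.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current weights during the new retraining session
    anchor_params -- weights saved at the end of the previous session
    fisher_diag   -- diagonal Fisher information estimated on earlier data
    lam           -- strength of the pull toward the previous weights (hypothetical value)
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all parameter vectors must have the same length")
    weighted_sq_diff = sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )
    return 0.5 * lam * weighted_sq_diff


def ewc_total_loss(
    new_session_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Loss on newly detected adversarial samples plus the EWC penalty."""
    return new_session_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)


if __name__ == "__main__":
    # Illustrative numbers only: the first weight moved, the second did not.
    print(ewc_penalty([1.0, 2.0], [0.0, 2.0], [2.0, 5.0], lam=3.0))
```

Weights with a large Fisher value are penalized more for drifting, which is how the penalty discourages forgetting of earlier sessions while still letting less important weights adapt to new adversarial samples.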
