Mal-D2GAN: Double-Detector based GAN for Malware Generation
Nam Hoang Thanh, Trung Pham Duy, Lam Bui Thu
TL;DR
Malware detectors based on ML are vulnerable to adversarial manipulation. The authors propose Mal-D2GAN, a double-detector GAN with a least-squares loss that uses a substitute and an additional detector to guide the generator in crafting strong adversarial malware samples that bypass a black-box detector. Evaluated on a 20,000-sample PE feature dataset, Mal-D2GAN substantially reduces true positive rates across eight detectors and outperforms MalGAN and Mal-LSGAN, with demonstrated resilience under detector retraining. The work provides a framework for stress-testing and potentially hardening malware detectors, highlighting both the effectiveness and the ongoing arms race between adversarial generation and detector robustness.
Abstract
Machine learning (ML) has been widely applied to malware detection in recent years. Most research has focused on improving detection performance while neglecting the robustness of the ML models, and many ML algorithms are highly vulnerable to intentional attacks. To address this, GANs have been used to generate adversarial malware examples that can strengthen the robustness of a malware detector. However, existing GAN models suffer from limitations such as unstable training and weak adversarial examples, so we propose the Mal-D2GAN model. Specifically, the Mal-D2GAN architecture combines a double-detector design with a least-squares loss function and was tested on a dataset of 20,000 samples. The results show that Mal-D2GAN reduced the detection accuracy (true positive rate) of 8 malware detectors. Its performance was also compared with that of the existing MalGAN and Mal-LSGAN models.
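The two quantities named in the abstract, a least-squares generator objective guided by two detectors and the true positive rate used for evaluation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the equal weighting of the two detector terms, and the convention that a detector score of 1 means "malware" and 0 means "benign" are all assumptions made for illustration.

```python
from typing import Sequence


def true_positive_rate(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of actual malware samples (label 1) that a detector flags (prediction 1)."""
    on_malware = [p for p, y in zip(predictions, labels) if y == 1]
    if not on_malware:
        raise ValueError("labels contain no malware samples")
    return sum(on_malware) / len(on_malware)


def ls_generator_loss(scores: Sequence[float], target: float = 0.0) -> float:
    """Least-squares loss pulling detector scores toward `target`.

    Assumption: scores lie in [0, 1] with 1 = malware, so target 0.0 means
    the generator wants its adversarial samples scored as benign.
    """
    return 0.5 * sum((s - target) ** 2 for s in scores) / len(scores)


def double_detector_generator_loss(
    substitute_scores: Sequence[float],
    extra_scores: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Hypothetical combination of the two detector terms as a weighted sum.

    `weight` is an illustrative parameter, not one taken from the paper.
    """
    return ls_generator_loss(substitute_scores) + weight * ls_generator_loss(extra_scores)


if __name__ == "__main__":
    # Toy scores that two detectors assign to the same batch of generated samples.
    print(double_detector_generator_loss([1.0, 0.0], [0.5, 0.5]))
    # Toy black-box predictions on four samples, three of which are malware.
    print(true_positive_rate([1, 0, 1, 1], [1, 1, 1, 0]))
```

Under these assumptions, a lower generator loss means both detectors score the generated samples closer to benign, and a lower true positive rate on adversarial malware means the attacked detector misses more of it.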
