A Survey of Malware Detection Using Deep Learning
Ahmed Bensaoud, Jugal Kalita, Mahmoud Bensaoud
TL;DR
This survey analyzes how deep learning is applied to malware detection across Windows, macOS, Linux, Android, and iOS, detailing static, dynamic, and hybrid detection methods while highlighting benchmark challenges and the need for robust, explainable models. It covers malware image classification, transfer learning, NLP, and XAI, and discusses adversarial robustness, data generation, and cryptographic ransomware detection with references to datasets such as MalImg, BIG2015, and the Microsoft Malware Dataset. The paper emphasizes practical considerations like model generalization, training efficiency, and cross-platform applicability, and it evaluates a range of DL approaches including EfficientNet variants and CapsNet in the context of malware imagery. It also identifies gaps in standard benchmarks, interpretability, and defense against adversarial threats, proposing future directions such as multi-task learning, cross-dataset evaluation, and integration of XAI for trustworthy defense systems.
Abstract
Detecting and classifying malicious software (malware) is a complex task, and no approach is perfect; much work remains to be done. Unlike most other research areas, malware detection has few standard benchmarks. This paper investigates recent advances in malware detection on macOS, Windows, iOS, Android, and Linux using deep learning (DL). We examine DL for text and image classification, the use of pre-trained and multi-task learning models to obtain high accuracy, and which approach would be best given a standard benchmark dataset. We discuss the issues and challenges in malware detection using DL classifiers by reviewing their effectiveness and their inability to explain their decisions and actions to DL developers, which motivates the use of Explainable Machine Learning (XAI) or Interpretable Machine Learning (IML). Additionally, we discuss the impact of adversarial attacks on deep learning models, which degrade their generalization capabilities and result in poor performance on unseen data. We believe there is a need to train and test the effectiveness and efficiency of current state-of-the-art deep learning models on different malware datasets. We examine eight popular DL approaches on various datasets. This survey will help researchers develop a general understanding of malware recognition using deep learning.
