Deep Learning Fusion For Effective Malware Detection: Leveraging Visual Features

Jahez Abraham Johny; Vinod P.; Asmitha K. A.; G. Radhamani; Rafidha Rehiman K. A.; Mauro Conti

TL;DR

The paper tackles malware detection by fusing three visual representations of binaries (Grayscale Image, Entropy Graph, SimHash Image) using separate VGG16 branches and various fusion operators. It evaluates how fusion choices affect classification performance, revealing that concatenation across all modalities yields near-perfect F1 scores on the BIG2015 dataset and robust performance against obfuscated samples. Interpretability is addressed with Grad-CAM and t-SNE visualizations, showing meaningful, explorable feature regions and clusters that support trust in the model. The approach achieves real-time detection and demonstrates advantages over prior methods by preserving modality-specific information and providing insights into decision-making through activation maps and cluster analyses.

Abstract

Malware has become a formidable threat, growing exponentially in number and sophistication, so a solution that is easy to implement, reliable, and effective is imperative. While recent research has introduced deep learning multi-feature fusion algorithms, these lack a proper explanation. In this work, we investigate the power of fusing Convolutional Neural Network models trained on different modalities of a malware executable. We propose a novel multimodal fusion algorithm that leverages three visual malware features: Grayscale Image, Entropy Graph, and SimHash Image. We conducted exhaustive experiments on each feature independently and on combinations of all three, using the fusion operators average, maximum, add, and concatenate for effective malware detection and classification. The proposed strategy has a detection rate of 1.00 (on a scale of 0-1) in identifying malware in the given dataset. We explain its interpretability with visualization techniques such as t-SNE and Grad-CAM. Experimental results show that the model works even on a highly imbalanced dataset. We also assessed the effectiveness of the proposed method on obfuscated malware and achieved state-of-the-art results. The proposed methodology is more reliable, as our findings show that the VGG16 model can detect and classify malware in a matter of seconds in real time.
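The four fusion operators named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `fuse`, the toy feature vectors, and the assumption that every branch emits a feature array of the same shape `(batch, dim)` are all hypothetical, chosen only to show how the operators differ.

```python
import numpy as np


def fuse(features, op="concatenate"):
    """Fuse per-branch feature arrays of identical shape (batch, dim).

    `features` is a list with one array per modality, e.g. the outputs of
    the Grayscale, Entropy Graph, and SimHash branches (assumed shapes).
    """
    if op == "concatenate":
        # Keeps every modality's features side by side: (batch, dim * branches).
        return np.concatenate(features, axis=-1)

    stacked = np.stack(features, axis=0)  # (branches, batch, dim)
    if op == "add":
        return stacked.sum(axis=0)
    if op == "average":
        return stacked.mean(axis=0)
    if op == "maximum":
        return stacked.max(axis=0)
    raise ValueError(f"unknown fusion operator: {op!r}")


# Toy feature vectors standing in for the three branch outputs.
gs = np.array([[1.0, 2.0]])
eg = np.array([[3.0, 0.0]])
sh = np.array([[2.0, 4.0]])

added = fuse([gs, eg, sh], "add")
averaged = fuse([gs, eg, sh], "average")
maxed = fuse([gs, eg, sh], "maximum")
concatenated = fuse([gs, eg, sh], "concatenate")
```

Add, average, and maximum collapse the three branches into one vector of the original width, whereas concatenation keeps each modality's features separate, which matches the TL;DR's remark that the approach preserves modality-specific information.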

Paper Structure (17 sections, 7 equations, 12 figures, 7 tables)

Introduction
Related work
Proposed methodology
Data Collection
Feature Extraction
Grayscale Image
Entropy Graph
Simhash Image
Classification Model
Experiments and Results
Experimental Set-up
Evaluation Metrics
Experiment-1: Effectiveness of GS, EG, and SH VGG16 models in classifying malware binaries
Experiment-2: Assessing the merging of malware visualization models using various merge operators
Experiment-3: Interpretability of the proposed method
...and 2 more sections

Figures (12)

Figure 1: Process of Classification, with three features and VGG16 Models of each feature. Each model is then fused using the operator $\odot$ which can be either add, max, avg, or concatenation
Figure 2: Highly imbalanced dataset of 9 malware families
Figure 3: Grid representation of calculating the interpolated pixel P. Pixels T11, T12, T21, and T22 are used to calculate $S_{1}$ and $S_{2}$, from which P is then calculated
Figure 4: Grayscale, Entropy Graph, and Simhash of malware families (Gatak, Kelihos_ver3, and Vundo)
Figure 5: Proposed Architecture
...and 7 more figures
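The interpolation described in the Figure 3 caption reads like standard bilinear interpolation, which can be sketched as follows. This is an assumption about the layout rather than the paper's stated formula: the sketch takes T11 and T12 as the top-row neighbours, T21 and T22 as the bottom-row neighbours, and introduces the hypothetical fractional offsets `dx` and `dy` (both in the range 0 to 1) for the position of P inside the grid cell.

```python
def bilinear(t11, t12, t21, t22, dx, dy):
    """Interpolate pixel P from its four neighbours (assumed layout).

    t11, t12: top-left and top-right neighbours.
    t21, t22: bottom-left and bottom-right neighbours.
    dx, dy:   horizontal and vertical offsets of P within the cell, in [0, 1].
    """
    s1 = t11 + dx * (t12 - t11)  # interpolate along the top row
    s2 = t21 + dx * (t22 - t21)  # interpolate along the bottom row
    return s1 + dy * (s2 - s1)   # interpolate between the two rows


# P at the centre of the cell is the mean of the four neighbours.
p_centre = bilinear(0.0, 10.0, 20.0, 30.0, dx=0.5, dy=0.5)
```

With zero offsets the result is T11 itself, and with both offsets equal to 1 it is T22, so resized images keep the original pixel values at grid points.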
