Explainable Deep Learning Models for Dynamic and Online Malware Classification

Quincy Card; Daniel Simpson; Kshitiz Aryal; Maanak Gupta; Sheikh Rabiul Islam

TL;DR

This work addresses the need for interpretable malware classification in both dynamic and online execution environments. It trains FFNN and CNN models on feature sets drawn from dynamic Android and online Windows datasets, and applies SHAP, LIME, and Permutation Importance to provide global and local explanations of predictions. The study demonstrates competitive classification performance, shows the benefits of SMOTE for imbalanced data, and analyzes the computational costs and practical robustness of explanation methods in time-series contexts. The findings offer guidance for deploying real-time, interpretable malware detectors and highlight future directions, including time-series explainability and adversarial considerations.

Abstract

In recent years, there has been a significant surge in malware attacks, necessitating more advanced preventive measures and remedial strategies. While several successful AI-based malware classification approaches exist, categorized into static, dynamic, or online analysis, most of these models lack easily interpretable decisions and explanations for their processes. Our paper explores explainable malware classification across various execution environments (such as dynamic and online), analyzing their respective strengths, weaknesses, and commonalities. To evaluate our approach, we train Feed Forward Neural Networks (FFNN) and Convolutional Neural Networks (CNN) to classify malware based on features obtained from dynamic and online analysis environments. Feature attribution for malware classification is performed with the explainability tools SHAP, LIME, and Permutation Importance. We perform a detailed evaluation of the global and local explanations calculated in the experiments, discuss limitations, and offer recommendations for achieving a balanced approach.
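As an illustration of one of the attribution methods named in the abstract, the following is a minimal sketch of Permutation Importance: each feature column is shuffled in turn, and the resulting drop in accuracy is taken as that feature's global importance. This is not the authors' implementation. The function name `permutation_importance`, the stand-in classifier `toy_predict`, and the synthetic data are all hypothetical and chosen only to keep the example self-contained; in the paper's setting the predictor would be a trained FFNN or CNN.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Return the mean accuracy drop observed when each feature is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)  # accuracy on the unmodified data
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only column j, breaking its link to the labels.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances

# Hypothetical stand-in for a trained model: the label depends only on feature 0.
def toy_predict(X):
    return (X[:, 0] > 0.5).astype(int)

rng = np.random.default_rng(42)
X_demo = rng.random((200, 3))   # synthetic data, not from the paper's datasets
y_demo = toy_predict(X_demo)    # labels the toy model predicts perfectly

scores = permutation_importance(toy_predict, X_demo, y_demo)
print(scores)  # feature 0 should dominate; features 1 and 2 should score zero
```

Because the toy classifier ignores features 1 and 2, shuffling them leaves its predictions unchanged, so only feature 0 receives a nonzero score. The same loop works with any model that exposes a prediction function, at the cost of one full evaluation pass per feature per repeat.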

Paper Structure (14 sections, 8 figures, 7 tables)

Introduction
Related Works
Dynamic Analysis
Online Analysis
Explainable AI
Methodology
Dynamic Analysis
Online Analysis
Explainability Approach
Results and Discussion
Evaluation of Performance Metrics
Global Explanation
Local Explanation
Conclusion and Future Work

Figures (8)

Figure 1: Performance of models in Online Analysis
Figure 2: Performance of models in Dynamic Analysis
Figure 3: A stacked bar graph depicting the top 10 online features identified by SHAP in model decision making
Figure 4: A stacked bar graph depicting the top 10 online features identified by SHAP in model decision making
Figure 5: Waterfall plots - Local interpretations of misclassified Riskware sample of dynamic data set
...and 3 more figures
