A Visualized Malware Detection Framework with CNN and Conditional GAN

Fang Wang, Hussam Al Hamadi, Ernesto Damiani

TL;DR

This paper proposes an integrated framework that addresses common problems encountered by ML practitioners when developing malware detection systems. It preserves the identities of benign/malign samples by encoding each variable into binary digits and mapping them to black and white pixels.

Abstract

Malware visualization analysis combined with Machine Learning (ML) has proven to be a promising approach for improving security defenses on different platforms. In this work, we propose an integrated framework that addresses common problems encountered by ML practitioners when developing malware detection systems. Specifically, a pictorial presentation system with extensions is designed to preserve the identities of benign/malign samples by encoding each variable into binary digits and mapping them to black and white pixels. A model based on a conditional Generative Adversarial Network is adopted to produce synthetic images and mitigate class imbalance. Detection models built on Convolutional Neural Networks are used to validate performance when training on datasets with and without the synthetic samples. The results show accuracy rates of 98.51% and 97.26% for these two training scenarios.
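The encoding step described in the abstract (each variable converted to binary digits, each digit mapped to a black or white pixel) can be illustrated with a minimal sketch. The function names `encode_features` and `decode_features`, the 8-bit width, the one-row-per-variable layout, and the choice of white for 1 and black for 0 are all assumptions made for illustration; the abstract does not specify them, and the paper's actual system may differ.

```python
def encode_features(values, n_bits=8):
    """Encode non-negative integers as rows of black/white pixels.

    Assumptions (not taken from the paper): one row per variable,
    most significant bit first, 1 -> white (255), 0 -> black (0).
    """
    image = []
    for v in values:
        if v < 0 or v >= 2 ** n_bits:
            raise ValueError(f"{v} does not fit in {n_bits} unsigned bits")
        # Extract bits from most significant to least significant.
        bits = [(v >> shift) & 1 for shift in range(n_bits - 1, -1, -1)]
        image.append([255 * b for b in bits])
    return image


def decode_features(image):
    """Recover the original integers from the pixel rows."""
    values = []
    for row in image:
        v = 0
        for pixel in row:
            # Shift in one bit per pixel; any non-zero pixel counts as 1.
            v = (v << 1) | (1 if pixel else 0)
        values.append(v)
    return values


if __name__ == "__main__":
    # The two values used in the Figure 1 caption.
    img = encode_features([19, 22])
    for row in img:
        print("".join("#" if p else "." for p in row))
    print(decode_features(img))
```

Because the mapping is lossless, the original variable values can be read back from the image, which is one way to interpret the claim that sample identities are preserved.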

Paper Structure (16 sections, 5 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: A Demonstration of PRS as $x_{1} = 19$ and $x_{2} = 22$
  • Figure 2: CNN Model Architecture
  • Figure 3: cGAN Model Demonstration
  • Figure 4: Benign and Malign Samples Generated through PRS
  • Figure 5: The Learning Curve of cGAN Model. The top panel shows the discriminator loss on real images (blue), the discriminator loss on generated fake images (orange), and the generator loss (green). The bottom panel shows the discriminator accuracy on real (blue) and fake (orange) images during training. In both panels, convergence is reached at around 1500 iterations.
  • ...and 2 more figures