Deep Multi-Task Learning for Malware Image Classification

Ahmed Bensaoud; Jugal Kalita

TL;DR

This work tackles malware detection by reframing it as color-image classification and solving it with a deep multi-task learning framework across binaries from Windows, Android, Linux, MacOS, and iOS. It combines large-scale, multi-format data with CycleGAN-driven data augmentation for MacOS samples and a seven-task CNN with PReLU activations, reaching accuracy above 99.87% on all tasks. The study reports that multi-task learning improves performance over single-task baselines and that color images capture richer discriminative information than grayscale. The results, on a public dataset, suggest strong practical potential for fast malware detection that is robust to obfuscation techniques.

Abstract

Malicious software is a pernicious global problem. A novel multi-task learning framework is proposed in this paper for malware image classification for accurate and fast malware detection. We generate bitmap (BMP) and Portable Network Graphics (PNG) images from malware features, which we feed to a deep learning classifier. Our state-of-the-art multi-task learning approach has been tested on a new dataset, for which we have collected approximately 100,000 benign and malicious PE, APK, Mach-O, and ELF examples. Experiments with seven tasks and four activation functions (ReLU, LeakyReLU, PReLU, and ELU), each tested separately, demonstrate that PReLU gives the highest accuracy of more than 99.87% on all tasks. Our model can effectively detect a variety of obfuscation methods such as packing, encryption, and instruction overlapping, which strengthens the case for our model, in addition to matching state-of-the-art methods in terms of accuracy.
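
For readers unfamiliar with the four activation functions compared in the abstract, the following is a minimal sketch of their standard scalar definitions. It is not the authors' code: the function names, the default slope of 0.01 for LeakyReLU, and the default alpha of 1.0 for ELU are illustrative assumptions. In a real network the PReLU coefficient `a` is a parameter learned during training; here it is simply passed in.

```python
import math


def relu(x: float) -> float:
    # ReLU: pass positive inputs through, clamp negatives to zero.
    return max(0.0, x)


def leaky_relu(x: float, slope: float = 0.01) -> float:
    # LeakyReLU: small fixed slope for negative inputs (0.01 is an assumed default).
    return x if x > 0 else slope * x


def prelu(x: float, a: float) -> float:
    # PReLU: same shape as LeakyReLU, but the negative slope `a` is learned.
    # It is supplied by the caller here because this sketch does no training.
    return x if x > 0 else a * x


def elu(x: float, alpha: float = 1.0) -> float:
    # ELU: smooth exponential curve for negative inputs (alpha=1.0 is an assumed default).
    return x if x > 0 else alpha * (math.exp(x) - 1.0)
```

The only difference between LeakyReLU and PReLU in this sketch is where the negative slope comes from: a fixed constant in the first case, a trainable value in the second.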

Paper Structure (33 sections, 2 equations, 21 figures, 7 tables)

Introduction
Related work
Malware detection
Malware Image
Multi-Task Learning
Methodology
PE Malware
Executable and Linkable Format (ELF)
MacOS X and iOS Malware
Android Malware
Unpacking Malware
Assembling Code to Image
Generating Images
Malware Image Generation
Bitmap (BMP) and Portable Network Graphics (PNG)
...and 18 more sections

Figures (21)

Figure 1: Total number of malware is increasing from quarter to quarter
Figure 2: Hard parameter sharing for multi-task learning (MTL)
Figure 3: Soft parameter sharing for multi-task learning (MTL)
Figure 4: PE file structure
Figure 5: Converting MacOS malware Mach-o file to image
...and 16 more figures