Machine Learning for Windows Malware Detection and Classification: Methods, Challenges and Ongoing Research

Daniel Gibert

TL;DR

This chapter introduces the main components of a machine learning pipeline, highlights the difficulty of collecting and maintaining up-to-date datasets, and outlines the primary challenges faced by machine learning-based malware detectors, including concept drift and adversarial attacks.

Abstract

In this chapter, readers will explore how machine learning has been applied to build malware detection systems for the Windows operating system. The chapter starts by introducing the main components of a machine learning pipeline, highlighting the challenges of collecting and maintaining up-to-date datasets. It then presents various state-of-the-art malware detectors, encompassing both feature-based and deep learning-based approaches. Subsequent sections introduce the primary challenges encountered by machine learning-based malware detectors, including concept drift and adversarial attacks. The chapter concludes with a brief overview of ongoing research on adversarial defenses.

Paper Structure (25 sections, 1 equation, 16 figures, 2 tables)

Figures (16)

  • Figure 1: A graphical depiction of the PE file format.
  • Figure 2: MalConv architecture [DBLP:conf/aaai/RaffBSBCN18].
  • Figure 3: AvastConv architecture.
  • Figure 4: ShallowConv architecture [DBLP:conf/ccia/GibertBMPSV17, GIBERT2021102159].
  • Figure 5: Grayscale image representation of malware binaries belonging to the Kelihos_ver1, Obfuscator.ACY and Gatak families, respectively [DBLP:journals/virology/GibertMPV19].
  • ...and 11 more figures