Certified Adversarial Robustness of Machine Learning-based Malware Detectors via (De)Randomized Smoothing

Daniel Gibert, Luca Demetrio, Giulio Zizzo, Quan Le, Jordi Planes, Battista Biggio

TL;DR

A certifiable defense against patch attacks that guarantees that, for a given executable and adversarial patch size, no adversarial EXEmple exists. The method is inspired by (de)randomized smoothing, which provides deterministic robustness certificates.

Abstract

Deep learning-based malware detection systems are vulnerable to adversarial EXEmples - carefully-crafted malicious programs that evade detection with minimal perturbation. As such, the community is dedicating effort to developing mechanisms to defend against adversarial EXEmples. However, current randomized smoothing-based defenses are still vulnerable to attacks that inject blocks of adversarial content. In this paper, we introduce a certifiable defense against patch attacks that guarantees that, for a given executable and adversarial patch size, no adversarial EXEmple exists. Our method is inspired by (de)randomized smoothing, which provides deterministic robustness certificates. During training, a base classifier is trained using subsets of contiguous bytes. At inference time, our defense splits the executable into non-overlapping chunks, classifies each chunk independently, and computes the final prediction through majority voting to minimize the influence of injected content. Furthermore, we introduce a preprocessing step that fixes the size of the sections and headers to a multiple of the chunk size. As a consequence, the injected content is confined to an integer number of chunks without tampering with the other chunks containing the real bytes of the input examples, allowing us to extend our certified robustness guarantees to content insertion attacks. We perform an extensive ablation study, comparing our defense with randomized smoothing-based defenses against a range of content manipulation attacks and neural network architectures. Results show that our method exhibits unmatched robustness against strong content-insertion attacks, outperforming randomized smoothing-based defenses in the literature.
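The inference procedure described in the abstract (split into non-overlapping chunks, classify each chunk, take a majority vote) can be illustrated with a minimal sketch. Everything named here is hypothetical: `toy_base_classifier` is a stand-in for a trained model, the label encoding is arbitrary, and the tie-breaking rule is an assumption rather than the paper's.

```python
from collections import Counter
from typing import Callable, List

BENIGN, MALWARE = 0, 1  # arbitrary label encoding chosen for this sketch


def split_into_chunks(data: bytes, z: int) -> List[bytes]:
    """Split data into consecutive non-overlapping chunks of z bytes.

    The final chunk may be shorter than z if len(data) is not a multiple of z.
    """
    if z <= 0:
        raise ValueError("chunk size must be positive")
    return [data[i:i + z] for i in range(0, len(data), z)]


def smoothed_predict(data: bytes, z: int,
                     base_classifier: Callable[[bytes], int]) -> int:
    """Classify every chunk independently and return the majority label."""
    chunks = split_into_chunks(data, z)
    if not chunks:
        raise ValueError("cannot classify an empty input")
    votes = Counter(base_classifier(chunk) for chunk in chunks)
    top = max(votes.values())
    # Assumption: ties are broken toward the smaller label.
    return min(label for label, count in votes.items() if count == top)


def toy_base_classifier(chunk: bytes) -> int:
    """Hypothetical stand-in for a trained model: flags chunks containing 0xCC."""
    return MALWARE if 0xCC in chunk else BENIGN


# Three "malicious" chunks and one clean chunk, with chunk size z = 4.
sample = bytes([0xCC] * 12) + bytes(4)
print(smoothed_predict(sample, 4, toy_base_classifier))
```

Because each chunk contributes exactly one vote, content confined to a few chunks can change only those few votes, which is the property the certificate relies on.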

Paper Structure

This paper contains 21 sections, 4 theorems, 14 equations, 7 figures, 11 tables, 2 algorithms.

Key Result

Theorem 1

For any input example $x$, base classifier $f$, smoothing classifier $g$, smoothing chunk size $z$, and patch size $p$, such that $\Delta = \left\lceil \frac{p}{z} \right\rceil + 1$, if: where $n_{c'}(x)$ and $n_{c''}(x)$ denote the most commonly and second most commonly predicted classes for an input example $x$, then for any adversarial EXEmple $x'$ which differs from $x$ only in a patch of size $p$ ...
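As a worked illustration of the quantity $\Delta = \lceil p/z \rceil + 1$, the sketch below computes it and checks by brute force that a contiguous patch of $p$ bytes never overlaps more than $\Delta$ non-overlapping chunks of size $z$. The function names are hypothetical. The `margin_certifies` check is a generic, conservative sufficient condition for majority voting (each changed vote moves the gap between the top two counts by at most 2); it is an assumption of this sketch and not the theorem's exact inequality.

```python
def affected_chunks_bound(p: int, z: int) -> int:
    """Delta = ceil(p / z) + 1, computed with integer arithmetic."""
    if p < 1 or z < 1:
        raise ValueError("p and z must be positive")
    return -(-p // z) + 1


def max_chunks_touched(p: int, z: int) -> int:
    """Brute force: the most chunks a contiguous p-byte patch can overlap.

    Only start offsets 0..z-1 need checking because the chunk grid has period z.
    """
    best = 0
    for start in range(z):
        first_chunk = start // z
        last_chunk = (start + p - 1) // z
        best = max(best, last_chunk - first_chunk + 1)
    return best


def margin_certifies(n_top: int, n_second: int, delta: int) -> bool:
    """Assumed sufficient condition: the vote gap exceeds 2 * delta.

    If at most delta votes change, the top count drops by at most delta and
    any other count rises by at most delta, so the top class stays ahead.
    """
    return n_top - n_second > 2 * delta


delta = affected_chunks_bound(1024, 512)
print(delta, max_chunks_touched(1024, 512))
print(margin_certifies(20, 3, delta), margin_certifies(10, 5, delta))
```

The brute-force count shows the bound can be loose for some combinations of $p$ and $z$, which only makes a certificate based on $\Delta$ more conservative.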

Figures (7)

  • Figure 1: An overview of the chunk-based smoothing classification scheme.
  • Figure 2: An illustration of the $\text{PREPROCESS}$ operation. The $\text{PREPROCESS}$ operation pads the headers and sections of Portable Executable files to a multiple of the chunk size $z$ used to split the executable into non-overlapping chunks. By doing so, we ensure that the predictions for the chunks corresponding to the headers and sections not manipulated by the attacker remain consistent between the original and adversarial EXEmples, allowing us to extend the certification procedure to content injection attacks.
  • Figure 3: Graphical representation of the locations perturbed by different attack strategies.
  • Figure 4: Graphical visualization of the scores of EXEmples belonging to the Autoinject malware family.
  • Figure 5: Graphical visualization of the scores of EXEmples belonging to the GandCrab malware family.
  • ...and 2 more figures
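
The padding idea behind the $\text{PREPROCESS}$ operation in Figure 2 can be sketched on plain byte strings. This is a simplified illustration under stated assumptions: regions are given as a list of `bytes` objects, the zero fill byte is an arbitrary choice, and the function names are hypothetical. A real Portable Executable would also need its header fields (section sizes and offsets) updated, which this sketch does not attempt.

```python
from typing import List


def pad_to_multiple(region: bytes, z: int, fill: int = 0) -> bytes:
    """Pad a region with fill bytes so its length is a multiple of z."""
    if z <= 0:
        raise ValueError("chunk size must be positive")
    remainder = len(region) % z
    if remainder == 0:
        return region
    return region + bytes([fill]) * (z - remainder)


def preprocess(regions: List[bytes], z: int) -> bytes:
    """Concatenate regions after aligning each one to the chunk grid."""
    return b"".join(pad_to_multiple(region, z) for region in regions)


def chunks(data: bytes, z: int) -> List[bytes]:
    """Split data into consecutive non-overlapping chunks of z bytes."""
    return [data[i:i + z] for i in range(0, len(data), z)]


z = 4
header, text = b"HDR__", b"CODECODE!"  # toy stand-ins for a header and a section
clean = chunks(preprocess([header, text], z), z)
# Hypothetical attack: a new region is injected between the two original ones.
adversarial = chunks(preprocess([header, b"INJECTED!!", text], z), z)

# The original regions still map to exactly the same chunks.
print(adversarial[:2] == clean[:2], adversarial[-3:] == clean[2:])
```

Because every region starts on a chunk boundary, the injected bytes occupy whole chunks of their own and the chunks of untouched regions are byte-for-byte identical before and after the insertion.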

Theorems & Definitions (8)

  • Remark
  • Remark
  • Theorem 1
  • Theorem 2
  • Theorem 1
  • proof
  • Theorem 2
  • proof