A Robust Defense against Adversarial Attacks on Deep Learning-based Malware Detectors via (De)Randomized Smoothing

Daniel Gibert, Giulio Zizzo, Quan Le, Jordi Planes

TL;DR

The paper tackles the vulnerability of deep learning–based malware detectors to adversarial, functionality-preserving modifications of executables. It introduces a chunk-based (de)randomized smoothing framework, training a base classifier on ablated byte chunks and aggregating predictions over $L$ chunks at inference to achieve robust final decisions; two schemes are proposed: Randomized Chunk-based Ablation (RCA) and Structural Chunk-based Ablation (SCA). Empirical evaluation on the BODMAS dataset with MalConv-based detectors shows that RCA-MalConv and SCA-MalConv achieve comparable accuracy on clean data while significantly improving resilience against state-of-the-art evasion attacks (Slack+Padding, Shift, GAMMA, Code Caves) relative to non-smoothed and prior smoothing methods. The approach is model-agnostic, interpretable at the chunk level, and offers a practical defense against adversarial malware examples with a favorable trade-off between robustness and efficiency. The work lays a foundation for broader adoption of chunk-wise ablation smoothing and points to future work on more granular chunk labeling and adversarial content removal.

Abstract

Deep learning-based malware detectors have been shown to be susceptible to adversarial malware examples, i.e., malware examples that have been deliberately manipulated to avoid detection. In light of the vulnerability of deep learning detectors to subtle input file modifications, we propose a practical defense against adversarial malware examples inspired by (de)randomized smoothing. In this work, we reduce the chances of sampling adversarial content injected by malware authors by selecting correlated subsets of bytes, rather than using Gaussian noise to randomize inputs as in the computer vision (CV) domain. During training, our ablation-based smoothing scheme trains a base classifier to make classifications on a subset of contiguous bytes (a chunk of bytes). At test time, a large number of chunks are classified by the base classifier, and the consensus among these classifications is reported as the final prediction. We propose two strategies to determine the location of the chunks used for classification: (1) randomly selecting the locations of the chunks and (2) selecting contiguous adjacent chunks. To showcase the effectiveness of our approach, we have trained two classifiers with our chunk-based ablation schemes on the BODMAS dataset. Our findings reveal that the chunk-based smoothing classifiers exhibit greater resilience against adversarial malware examples generated with state-of-the-art evasion attacks, outperforming a non-smoothed classifier and a randomized smoothing-based classifier by a wide margin.
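
The inference procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the chunk size, the tie-breaking rule (ties go to benign), and the toy base classifier (a byte-pattern check standing in for a trained MalConv model) are all assumptions made for the example.

```python
import random
from typing import Callable, List


def random_chunks(data: bytes, chunk_size: int, num_chunks: int,
                  rng: random.Random) -> List[bytes]:
    """Strategy (1): chunks whose start offsets are selected at random."""
    if len(data) <= chunk_size:
        return [data]
    max_start = len(data) - chunk_size
    starts = [rng.randint(0, max_start) for _ in range(num_chunks)]
    return [data[s:s + chunk_size] for s in starts]


def adjacent_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Strategy (2): contiguous, adjacent chunks that tile the whole file."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def smoothed_predict(chunks: List[bytes],
                     base_classifier: Callable[[bytes], int]) -> int:
    """Report the consensus (majority vote) of the per-chunk predictions.

    Labels: 1 = malicious, 0 = benign. Ties go to benign; this rule is an
    assumption of the sketch, not something stated in the abstract.
    """
    malicious_votes = sum(base_classifier(chunk) for chunk in chunks)
    benign_votes = len(chunks) - malicious_votes
    return 1 if malicious_votes > benign_votes else 0


def toy_base_classifier(chunk: bytes) -> int:
    """Hypothetical stand-in for a trained base classifier."""
    return 1 if b"EVIL" in chunk else 0


if __name__ == "__main__":
    malware = b"EVIL" * 300             # 1200 bytes of "malicious" content
    padded = malware + b"\x00" * 500    # attacker appends benign-looking bytes
    rng = random.Random(0)
    print(smoothed_predict(adjacent_chunks(padded, 100), toy_base_classifier))
    print(smoothed_predict(random_chunks(padded, 100, 50, rng),
                           toy_base_classifier))
```

Because each chunk is classified independently, injected bytes can only change the votes of the chunks they overlap. In this toy setting, an attacker must control a large share of the chunks before the consensus flips.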

Paper Structure

This paper contains 18 sections, 2 equations, 7 figures, 4 tables, and 3 algorithms.

Figures (7)

  • Figure 1: A graphical depiction of the PE file format and some practical manipulations [DBLP:conf/sp/SuciuCJ19, demetrio2021adversarial, demetrio2021functionality, YUSTE2022102643].
  • Figure 2: Illustration of the chunk-based ablation smoothing scheme. The preprocessing step extracts chunks of the input example as described in Sections \ref{sec:randomized_ablation} and \ref{sec:structural_ablation}.
  • Figure 3: Graphical depiction of the MalConv architecture.
  • Figure 4: Detection accuracy of the malware detectors on the adversarial examples generated by Suciu et al. [DBLP:conf/sp/SuciuCJ19].
  • Figure 5: Detection accuracy of the malware detectors on the adversarial examples generated by the Shift attack [demetrio2021adversarial].
  • ...and 2 more figures