Bayesian Learned Models Can Detect Adversarial Malware For Free

Bao Gia Doan, Dang Quang Nguyen, Paul Montague, Tamas Abraham, Olivier De Vel, Seyit Camtepe, Salil S. Kanhere, Ehsan Abbasnejad, Damith C. Ranasinghe

TL;DR

Bayesian models are found to be generally capable of identifying adversarial malware in both feature and problem space; with a diversity-promoting approach, parameter instances drawn from the posterior significantly enhance a detector's ability.

Abstract

The vulnerability of machine learning-based malware detectors to adversarial attacks has prompted the need for robust solutions. Adversarial training is an effective method but is computationally expensive to scale up to large datasets and comes at the cost of sacrificing model performance for robustness. We hypothesize that adversarial malware exploits the low-confidence regions of models and can be identified using the epistemic uncertainty of ML approaches; epistemic uncertainty in a machine learning-based malware detector results from a lack of similar training samples in regions of the problem space. In particular, a Bayesian formulation can capture the distribution over model parameters and quantify epistemic uncertainty without sacrificing model performance. To verify our hypothesis, we consider Bayesian learning approaches with a mutual information-based formulation to quantify uncertainty and detect adversarial malware in the Android, Windows, and PDF malware domains. We found that quantifying uncertainty through Bayesian learning methods can defend against adversarial malware. In particular, Bayesian models: (1) are generally capable of identifying adversarial malware in both feature and problem space, (2) can detect concept drift by measuring uncertainty, and (3) with a diversity-promoting approach (or better posterior approximations), yield parameter instances from the posterior that significantly enhance a detector's ability.
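
The mutual information-based uncertainty mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used decomposition (mutual information = entropy of the averaged prediction minus the average entropy of each posterior sample's prediction); the paper's exact formulation is not shown in this section. The function names `uncertainty_scores` and `flag_suspicious`, and the 95% quantile threshold, are hypothetical choices made for illustration only.

```python
import numpy as np


def uncertainty_scores(probs, eps=1e-12):
    """Compute predictive entropy and mutual information per input.

    probs: array of shape (S, N, C) holding class probabilities from
           S posterior parameter samples for N inputs and C classes.
    Returns two arrays of shape (N,).
    """
    # Average the predictions over the S posterior samples.
    mean_probs = probs.mean(axis=0)  # (N, C)

    # Entropy of the averaged prediction (total uncertainty).
    predictive_entropy = -np.sum(mean_probs * np.log(mean_probs + eps), axis=1)

    # Average entropy of each individual sample's prediction.
    per_sample_entropy = -np.sum(probs * np.log(probs + eps), axis=2)  # (S, N)
    expected_entropy = per_sample_entropy.mean(axis=0)

    # The gap is the mutual information between prediction and parameters,
    # commonly read as epistemic uncertainty.
    mutual_information = predictive_entropy - expected_entropy
    return predictive_entropy, mutual_information


def flag_suspicious(mi_scores, clean_mi_scores, quantile=0.95):
    """Flag inputs whose mutual information exceeds a threshold.

    Assumption: the threshold is a quantile of scores measured on
    held-out clean samples; this is an illustrative rule, not the paper's.
    """
    threshold = np.quantile(clean_mi_scores, quantile)
    return mi_scores > threshold
```

When all posterior samples agree on an input, the two entropy terms coincide and the mutual information is close to zero; when the samples make confident but conflicting predictions, the mutual information is large. That is the signal the abstract proposes for identifying adversarial malware.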

Paper Structure (19 sections, 9 equations, 8 figures, 5 tables)

Figures (8)

  • Figure 1: Illustration of functional, realistic, adversarial malware in the problem space, where $\textbf{z}'$ is the transformation of $\textbf{z}$ (a malware app) that passes the decision boundary in the detector's feature space and successfully fools the malware detector whilst satisfying problem-space constraints $\Omega$. The white areas, outside of the training data submanifolds, are regions of high uncertainty for ML-based malware detectors.
  • Figure 2: Using mutual information and predictive entropy to detect problem-space Android adversarial malware from SP'20 attacks with a budget $\epsilon=90$ (FFNN is a non-Bayesian baseline).
  • Figure 3: Performance of our proposed method to detect feature-space PGD-L1 adversarial Android malware with a budget $\epsilon=60$ (FFNN is a non-Bayesian baseline).
  • Figure 4: Performance of our proposed method to detect feature-space BCA and Grosse adversarial Android malware with a budget $\epsilon=10$ (FFNN is a non-Bayesian baseline).
  • Figure 5: Performance of our proposed method to detect PDF adversarial malware with an attack budget $\epsilon=7$ (FFNN is a non-Bayesian baseline).
  • ...and 3 more figures