Evaluating the Robustness of a Production Malware Detection System to Transferable Adversarial Attacks

Milad Nasr, Yanick Fratantonio, Luca Invernizzi, Ange Albertini, Loua Farah, Alex Petit-Bianco, Andreas Terzis, Kurt Thomas, Elie Bursztein, Nicholas Carlini

TL;DR

The paper addresses how transferable adversarial attacks on a production ML component can compromise a malware-detection pipeline, using Gmail's Magika as a case study. It demonstrates that small in-file perturbations can reroute inputs to inappropriate detectors, enabling end-to-end intrusion in Gmail under controlled conditions. The authors develop defenses, including an AES-based preprocessing approach, and, in collaboration with Google, deploy a production solution that substantially raises the attacker’s cost while maintaining practical performance. The work advocates for pragmatic security engineering in real-world ML deployments and provides a framework for evaluating end-to-end robustness in systems with ML components.

Abstract

As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level vulnerabilities with real-world impact. This paper studies how adversarial attacks targeting an ML component can degrade or bypass an entire production-grade malware detection system, through a case study of Gmail's pipeline, where file-type identification relies on an ML model. The malware detection pipeline used by Gmail contains a machine learning model that routes each potential malware sample to a specialized malware classifier to improve accuracy and performance. This model, called Magika, has been open sourced. By designing adversarial examples that fool Magika, we can cause the production malware service to route malware to an unsuitable malware detector, thereby increasing our chance of evading detection. Specifically, by changing just 13 bytes of a malware sample, we can successfully evade Magika in 90% of cases, which in turn allows us to send malware files over Gmail. We then turn our attention to defenses and develop an approach to mitigate the severity of these types of attacks. For our defended production model, a highly resourced adversary requires 50 bytes to achieve just a 20% attack success rate. We implement this defense, and, thanks to a collaboration with Google engineers, it has already been deployed in production for the Gmail classifier.

Paper Structure

This paper contains 45 sections, 2 equations, 18 figures, 3 algorithms.

Figures (18)

  • Figure 1: Overview of our attack strategy against Gmail. An attacker aims to send a malicious file as an email attachment. The attacker adversarially modifies the file such that the Magika file-type detector run by Google's servers misroutes the file to an incorrect, specialized security scanner (e.g., a PDF is sent to a Windows executable scanner), thus evading detection. Absent these adversarial modifications, the file would be sent to the correct security scanner and be accurately detected as malicious.
  • Figure 2: Schematic of our attack. Magika processes an arbitrarily large file by first extracting just 1536 bytes (the first, middle, and last 512 bytes), and then classifying these 1536 bytes with a neural network. By modifying just five of these bytes, we show how to cause the classifier to assign an arbitrary incorrect file type to a file, and thus route the malware sample to the wrong specialized classifier.
  • Figure 3: Attack success rate as a function of the number of bytes modified. By modifying just 5 bytes, our attack can cause $90\%$ of malicious files to be misclassified by Magika.
  • Figure 4: The cumulative number of changes required to misclassify files of each format while preserving the format.
  • Figure 5: A proof of concept for the attack that can be used in Gmail to launch an end-to-end attack. Given the differences in how different operating systems select which applications open the files
  • ...and 13 more figures
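
The input extraction described in the Figure 2 caption (the first, middle, and last 512 bytes of a file, 1536 values in total) can be illustrated with a short sketch. This is a simplified illustration and not Magika's actual implementation: the function names, the centering of the middle window, and the padding token `PAD` for short files are all assumptions made for the example. It shows why only edits that fall inside one of the three windows can influence the classifier's input.

```python
BLOCK = 512   # window size taken from the Figure 2 caption
PAD = 256     # hypothetical padding token, outside the byte range 0..255


def _pad(chunk: bytes, size: int) -> list[int]:
    """Convert a chunk to a list of ints, right-padded to `size` entries."""
    values = list(chunk[:size])
    return values + [PAD] * (size - len(values))


def extract_features(data: bytes, block: int = BLOCK) -> list[int]:
    """Return the first, middle, and last `block` bytes as one flat list.

    Assumption: the middle window is centered in the file. The real
    model's preprocessing may differ in its details.
    """
    first = data[:block]
    mid_start = max(0, (len(data) - block) // 2)
    middle = data[mid_start:mid_start + block]
    last = data[-block:]
    return _pad(first, block) + _pad(middle, block) + _pad(last, block)


def visible_changes(original: bytes, modified: bytes) -> int:
    """Count how many of the extracted values differ between two files."""
    a = extract_features(original)
    b = extract_features(modified)
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    data = bytes(i % 256 for i in range(10_000))

    outside = bytearray(data)
    outside[2000] ^= 0xFF          # falls between the first and middle windows
    inside = bytearray(data)
    inside[5000] ^= 0xFF           # falls inside the middle window

    print(len(extract_features(data)))
    print(visible_changes(data, bytes(outside)))
    print(visible_changes(data, bytes(inside)))
```

Under these assumptions, a byte edited outside the three windows leaves the extracted input unchanged, while a byte edited inside a window changes exactly one of the 1536 values. An attacker therefore has to place modifications within the windows the model actually reads.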