Trust Under Siege: Label Spoofing Attacks against Machine Learning for Android Malware Detection

Tianwei Lan, Luca Demetrio, Farid Nait-Abdesselam, Yufei Han, Simone Aonzo

TL;DR

This work exposes a real-world threat to ML-based Android malware detectors that rely on crowd-sourced AV labels by introducing AndroVenom, a label spoofing framework that injects minimal malicious patterns into benign APKs to coerce AVs into mislabeling them as malware. The authors demonstrate the feasibility of repackaging-based injections, show that even a tiny fraction of poisoned samples can cause DoS and targeted integrity attacks, and reveal that state-of-the-art anomaly detectors may fail to curb such poisoning. Across large-scale experiments with Drebin and MaMaDroid features, AndroVenom achieves significant reductions in test-time performance and high targeted misclassification rates with extremely small poisoning budgets. The study highlights the critical need to reassess trust in AV-based annotations for ML training and motivates development of stronger data provenance, richer feature sources, and robust defense mechanisms to mitigate such label spoofing attacks.

Abstract

Machine learning (ML) malware detectors rely heavily on crowd-sourced AntiVirus (AV) labels, with platforms like VirusTotal serving as a trusted source of malware annotations. But what if attackers could manipulate these labels to classify benign software as malicious? We introduce label spoofing attacks, a new threat that contaminates crowd-sourced datasets by embedding minimal and undetectable malicious patterns into benign samples. These patterns coerce AV engines into misclassifying legitimate files as harmful, enabling poisoning attacks against ML-based malware classifiers trained on those data. We demonstrate this scenario by developing AndroVenom, a methodology for polluting realistic data sources, causing consequent poisoning attacks against ML malware detectors. Experiments show not only that state-of-the-art feature extractors are unable to filter such injections, but also that various ML models experience Denial of Service with as little as 1% poisoned samples. Additionally, attackers can flip the decisions on specific unaltered benign samples by modifying only 0.015% of the training data, threatening the reputation and market share of those applications, and anomaly detectors applied to the training data are unable to stop the attack. We conclude by raising the alarm on the trustworthiness of training processes based on AV annotations, which calls for further investigation into how to produce proper labels for ML malware detectors.
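To give a feel for how small the poisoning budgets quoted in the abstract are, the following is a minimal sketch, not the authors' implementation. It converts a poisoning ratio into a sample count and builds toy spoofed-label samples. The training-set size of 100,000, the set-of-strings feature representation, the trigger name `injected_pattern`, and the choice to append poisoned samples rather than replace existing ones are all assumptions made for illustration. The sketch does not model APK repackaging, AV scanning, or real feature extraction.

```python
from fractions import Fraction
import math

# (feature set, label) with 0 = benign, 1 = malware. Toy representation (assumption).
Sample = tuple[frozenset[str], int]


def poison_count(n_train: int, ratio_percent: str) -> int:
    """Number of samples implied by a poisoning ratio given in percent.

    The ratio is passed as a string so that Fraction parses it exactly,
    avoiding binary floating-point rounding before the ceiling is taken.
    """
    return math.ceil(n_train * Fraction(ratio_percent) / 100)


def spoof_labels(
    train: list[Sample],
    n_poison: int,
    trigger: str = "injected_pattern",  # hypothetical feature left by the injected file
) -> list[Sample]:
    """Append copies of benign samples that carry the trigger and a malware label.

    This mimics the outcome of label spoofing: the app is still benign,
    but the label it receives is 'malware'.
    """
    benign = [feats for feats, label in train if label == 0]
    poisoned = [(feats | {trigger}, 1) for feats in benign[:n_poison]]
    return train + poisoned


if __name__ == "__main__":
    N_TRAIN = 100_000  # hypothetical training-set size, not taken from the paper
    print(poison_count(N_TRAIN, "1"))      # 1000 samples at a 1% ratio
    print(poison_count(N_TRAIN, "0.015"))  # 15 samples at a 0.015% ratio

    toy_train = [
        (frozenset({"perm.INTERNET"}), 0),
        (frozenset({"perm.SEND_SMS", "api.exec"}), 1),
    ]
    for feats, label in spoof_labels(toy_train, n_poison=1):
        print(sorted(feats), label)
```

Under these assumptions, the targeted attack would need only about 15 spoofed samples in a training set of 100,000, which illustrates why such a budget is hard to notice by inspecting the data volume alone.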

Paper Structure

This paper contains 13 sections, 2 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: Workflow of AndroVenom: (1) the attacker injects a malicious file into benign APKs and repacks them; (2) the attacker submits the modified benign APKs to VirusTotal, where they are now mislabelled as malware; (3) these mislabelled APKs are included in training sets of ML Android malware detectors; (4) the detectors are compromised after training.
  • Figure 2: ROC curves of three classifiers (LSVM, GBT, NN) using Drebin features on the small-scale and large-scale test.
  • Figure 3: ROC curve of RF using MaMaDroid features on the large-scale test.
  • Figure 4: ROC curves of LSVM, GBT, NN using Drebin features on the small-scale and large-scale test with the poisoning ratio 1%, 10%, and 20%.
  • Figure 5: ROC curves of RF using MaMaDroid features on the large-scale test with the poisoning ratio 1%, 10%, and 20%.
  • ...and 2 more figures