MASKDROID: Robust Android Malware Detection with Masked Graph Representations
Jingnan Zheng, Jiaohao Liu, An Zhang, Jun Zeng, Ziqi Yang, Zhenkai Liang, Tat-Seng Chua
TL;DR
MASKDROID addresses the vulnerability of graph-based Android malware detectors to adversarial perturbations by introducing a self-supervised graph reconstruction task and a proxy-based contrastive module. By masking a large portion of the input graph and reconstructing it from a small observed subset, the model learns stable representations of malicious behavior; the two-class anchors further sharpen discriminative power. Across extensive experiments on AndroZoo data (2016–2020), MASKDROID delivers superior robustness against white-box and black-box attacks while maintaining competitive detection accuracy compared to state-of-the-art detectors, and ablations confirm the contribution of each component. The approach offers a practical path toward robust, graph-based malware detection with scalable efficiency and strong defense against adversarial evasion.
Abstract
Android malware attacks pose a severe threat to mobile users, creating significant demand for automated detection systems. Among the various tools employed in malware detection, graph representations (e.g., function call graphs) play a pivotal role in characterizing the behaviors of Android apps. However, despite their impressive detection performance, current state-of-the-art graph-based malware detectors are vulnerable to adversarial examples, which are meticulously crafted by introducing specific perturbations to ordinary malicious inputs. Existing defensive mechanisms are typically supplementary additions to detectors and exhibit significant limitations: they often rely on prior knowledge of adversarial examples and fail to defend effectively against unseen types of attacks. In this paper, we propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware and remarkable robustness against adversarial attacks. Specifically, we introduce a masking mechanism into the Graph Neural Network (GNN) based framework, forcing MASKDROID to recover the whole input graph using a small portion (e.g., 20%) of randomly selected nodes. This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks. In addition to capturing stable malicious semantics in the form of dependencies inside the graph structures, we employ a contrastive module that encourages MASKDROID to learn more compact representations for both the benign and malicious classes, boosting its discriminative power in detecting malware among benign apps and adversarial examples.
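
The masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`split_visible_masked`, `apply_mask`), the zero-vector mask token, and the list-of-lists feature layout are all assumptions made for illustration. The sketch only shows how a small random subset of nodes (here 20%) might be kept visible while the remaining node features are hidden before a GNN encoder and decoder attempt to reconstruct them.

```python
import random


def split_visible_masked(num_nodes, keep_ratio=0.2, seed=None):
    """Randomly choose which node indices stay visible; the rest are masked.

    Hypothetical helper: keep_ratio=0.2 mirrors the "e.g., 20%" figure in the
    abstract, but the actual sampling scheme used by MASKDROID may differ.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    rng = random.Random(seed)
    # Keep at least one node visible so there is something to reconstruct from.
    num_visible = max(1, round(num_nodes * keep_ratio))
    visible = sorted(rng.sample(range(num_nodes), num_visible))
    visible_set = set(visible)
    masked = [i for i in range(num_nodes) if i not in visible_set]
    return visible, masked


def apply_mask(features, masked, mask_token):
    """Return a copy of the node features with masked rows replaced by mask_token.

    The mask token is an assumption; a learnable embedding is another option.
    """
    masked_set = set(masked)
    return [
        list(mask_token) if i in masked_set else list(row)
        for i, row in enumerate(features)
    ]


# Toy example: 10 nodes with 2-dimensional features.
features = [[float(i), 1.0] for i in range(10)]
visible, masked = split_visible_masked(len(features), keep_ratio=0.2, seed=0)
masked_features = apply_mask(features, masked, mask_token=[0.0, 0.0])
```

In a full pipeline, `masked_features` together with the graph's edges would be passed to the encoder, and a reconstruction loss would compare the decoder's output on the masked nodes against the original `features`.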
