MASKDROID: Robust Android Malware Detection with Masked Graph Representations

Jingnan Zheng, Jiaohao Liu, An Zhang, Jun Zeng, Ziqi Yang, Zhenkai Liang, Tat-Seng Chua

TL;DR

MASKDROID addresses the vulnerability of graph-based Android malware detectors to adversarial perturbations by introducing a self-supervised graph reconstruction task and a proxy-based contrastive module. By masking a large portion of the input graph and reconstructing it from a small observed subset, the model learns stable representations of malicious behavior; the two-class anchors further sharpen discriminative power. Across extensive experiments on AndroZoo data (2016–2020), MASKDROID delivers superior robustness against white-box and black-box attacks while maintaining competitive detection accuracy compared to state-of-the-art detectors, and ablations confirm the contribution of each component. The approach offers a practical path toward robust, graph-based malware detection with scalable efficiency and strong defense against adversarial evasion.

Abstract

Android malware attacks pose a severe threat to mobile users, creating strong demand for automated detection systems. Among the tools used in malware detection, graph representations (e.g., function call graphs) play a pivotal role in characterizing the behavior of Android apps. However, despite their impressive detection performance, current state-of-the-art graph-based malware detectors are vulnerable to adversarial examples, which are crafted by introducing specific perturbations to ordinary malicious inputs. Existing defensive mechanisms are typically supplementary additions to detectors and exhibit significant limitations: they often rely on prior knowledge of adversarial examples and fail to defend effectively against unseen types of attacks. In this paper, we propose MASKDROID, a detector with strong discriminative ability to identify malware and remarkable robustness against adversarial attacks. Specifically, we introduce a masking mechanism into the Graph Neural Network (GNN) based framework, forcing MASKDROID to recover the whole input graph from a small portion (e.g., 20%) of randomly selected nodes. This strategy enables the model to understand malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks. While capturing stable malicious semantics in the form of dependencies inside the graph structures, we further employ a contrastive module that encourages MASKDROID to learn more compact representations for both the benign and malicious classes, boosting its discriminative power in detecting malware from benign apps and adversarial examples.
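The masking step described in the abstract can be illustrated with a minimal sketch. It assumes that a fixed fraction of nodes (the mask rate, 0.8 here so that about 20% stay visible) is hidden uniformly at random and that hidden node features are replaced by a placeholder vector. The function names `mask_nodes` and `apply_mask` and the mask-token convention are hypothetical and are not taken from the paper.

```python
import random


def mask_nodes(num_nodes, mask_rate=0.8, seed=None):
    """Split node indices into (visible, masked) lists.

    mask_rate is the fraction of nodes hidden from the encoder; 0.8 leaves
    about 20% of the nodes visible, as in the abstract's example.
    """
    if num_nodes < 1:
        raise ValueError("graph must contain at least one node")
    if not 0.0 <= mask_rate < 1.0:
        raise ValueError("mask_rate must be in [0, 1)")
    rng = random.Random(seed)
    # Always keep at least one node visible so there is something to encode.
    num_masked = min(int(round(num_nodes * mask_rate)), num_nodes - 1)
    masked = set(rng.sample(range(num_nodes), num_masked))
    visible = [i for i in range(num_nodes) if i not in masked]
    return visible, sorted(masked)


def apply_mask(features, masked, mask_token):
    """Return a copy of the feature rows with masked rows replaced.

    Assumption: masked nodes are represented by one shared placeholder
    vector (mask_token); the original rows serve as reconstruction targets.
    """
    masked = set(masked)
    return [
        list(mask_token) if i in masked else list(row)
        for i, row in enumerate(features)
    ]


if __name__ == "__main__":
    feats = [[float(i), 1.0] for i in range(10)]
    visible, masked = mask_nodes(len(feats), mask_rate=0.8, seed=0)
    corrupted = apply_mask(feats, masked, mask_token=[-1.0, -1.0])
    print("visible nodes:", visible)
    print("masked nodes:", masked)
```

In a full training loop, the corrupted features would be passed through the GNN encoder and a decoder, and the reconstruction objective would be computed against the original rows. That part is omitted here because the excerpt does not specify it.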

Paper Structure (28 sections, 9 equations, 6 figures, 7 tables)

Figures (6)

  • Figure 1: Partial Function Call Graph (FCG) of an app.
  • Figure 2: The framework of MaskDroid. The training phase comprises two modules. The upper dashed-line bracket represents the self-supervised reconstruction module, while the lower bracket represents the proxy-based contrastive learning module.
  • Figure 3: Robustness Evaluation against White-Box Adversarial Attacks (mask rate $\gamma = 0.8$).
  • Figure 4: Framework of the model MaskDroid-cr that disables both the contrastive module and the reconstruction module from MaskDroid. The input graph goes through a two-layer GNN encoder, proceeds with a readout layer, and is then fed into an MLP classifier.
  • Figure 5: Ablation study on reconstruction/contrastive modules and mask rate $\gamma$ for MaskDroid's robustness against white-box adversarial attacks. The left subfigure presents the results of the reconstruction/contrastive modules, while the right subfigure illustrates the impact of the mask rate $\gamma$.
  • ...and 1 more figures
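
The MaskDroid-cr pipeline named in the Figure 4 caption (two-layer GNN encoder, readout layer, MLP classifier) can be sketched in plain Python as follows. This is only an illustration of the data flow under stated assumptions: mean aggregation over each node and its neighbours, mean pooling as the readout, and a single linear layer with softmax standing in for the MLP. None of these choices, nor the function names, are confirmed by the excerpt.

```python
import math


def matmul(a, b):
    """Multiply an n x d matrix by a d x m matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gnn_layer(adj, h, w):
    """One message-passing layer: mean over self and neighbours, linear map, ReLU.

    Assumption: mean aggregation; the paper's actual GNN variant may differ.
    """
    n = len(h)
    aggregated = []
    for i in range(n):
        nbrs = [i] + [j for j in range(n) if j != i and adj[i][j]]
        aggregated.append(
            [sum(h[j][k] for j in nbrs) / len(nbrs) for k in range(len(h[0]))]
        )
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, w)]


def readout(h):
    """Mean-pool node embeddings into a single graph embedding."""
    return [sum(col) / len(h) for col in zip(*h)]


def classify(g, w, b):
    """Linear layer plus softmax over two classes (benign, malicious).

    Simplification: a single layer stands in for the MLP classifier.
    """
    logits = [sum(x * wk for x, wk in zip(g, col)) + bk for col, bk in zip(zip(*w), b)]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(adj, x, w1, w2, wc, bc):
    """Two GNN layers, then readout, then classifier."""
    h = gnn_layer(adj, x, w1)
    h = gnn_layer(adj, h, w2)
    return classify(readout(h), wc, bc)


if __name__ == "__main__":
    # Toy 3-node call graph with 2-dimensional node features (made-up values).
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    eye = [[1.0, 0.0], [0.0, 1.0]]
    print(forward(adj, x, eye, eye, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]))
```

The weights here are fixed toy values; in practice they would be learned, and a graph library would replace the dense adjacency loops.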