Explainable Android Malware Detection and Malicious Code Localization Using Graph Attention

Merve Cigdem Ipek, Sevil Sen

TL;DR

XAIDroid presents an explainable approach to Android malware analysis by modeling apps as API-call graphs and applying dual graph attention (GAM for subgraph focus and GAT for edge-level attention) to localize malicious code at both class and method levels. The method combines static API-call graph representations with attention-based localization and is evaluated on synthetic Mystique data as well as real-world AMD/CICMalDroid samples, where it achieves high recall and competitive F1-scores. Key contributions include fine-grained malicious code localization (MCL) without pre-labeled class data, transparent attention-driven explanations of identified malicious segments, and robust malware detection performance through GAM+GAT ensembles. The work advances scalable, interpretable malware analysis with practical impact for security analysts and automated reporting, while acknowledging limitations in static analysis coverage and potential adversarial manipulations.

Abstract

With the escalating threat of malware, particularly on mobile devices, the demand for effective analysis methods has never been higher. While existing security solutions, including AI-based approaches, offer promise, their lack of transparency constrains the understanding of detected threats. Manual analysis remains time-consuming and reliant on scarce expertise. To address these challenges, we propose a novel approach called XAIDroid that leverages graph neural networks (GNNs) and graph attention mechanisms for automatically locating malicious code snippets within malware. By representing code as API call graphs, XAIDroid captures semantic context and enhances resilience against obfuscation. Utilizing the Graph Attention Model (GAM) and Graph Attention Network (GAT), we assign importance scores to API nodes, facilitating focused attention on critical information for malicious code localization. Evaluation on synthetic and real-world malware datasets demonstrates the efficacy of our approach, achieving high recall and F1-scores for malicious code localization. The successful implementation of automatic malicious code localization enhances the scalability, interpretability, and reliability of malware analysis.
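
To make the idea of turning per-node importance scores into a code location more concrete, the following is a minimal sketch, not the authors' implementation. It assumes importance scores for API nodes are already available (in XAIDroid they would come from GAM and GAT attention), and it assumes a simple sum as the aggregation rule. The function `localize`, the API names, the app method names, and the score values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def localize(api_scores, api_to_methods, top_k=2):
    """Rank app methods by the summed importance of the API nodes they invoke.

    api_scores:      hypothetical importance score per API node
    api_to_methods:  hypothetical map from API node to the app methods calling it
    Summation is an assumed aggregation rule, not taken from the paper.
    """
    method_scores = defaultdict(float)
    for api, score in api_scores.items():
        for method in api_to_methods.get(api, ()):
            method_scores[method] += score
    # Highest aggregated score first; these are the candidate malicious methods.
    ranked = sorted(method_scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

# Illustrative scores only; real values would be produced by the attention models.
api_scores = {
    "SmsManager.sendTextMessage": 0.55,
    "TelephonyManager.getDeviceId": 0.30,
    "Log.d": 0.05,
    "Activity.setContentView": 0.10,
}
api_to_methods = {
    "SmsManager.sendTextMessage": ["com.example.Payload.send"],
    "TelephonyManager.getDeviceId": ["com.example.Payload.send",
                                     "com.example.Main.onCreate"],
    "Log.d": ["com.example.Main.onCreate"],
    "Activity.setContentView": ["com.example.Main.onCreate"],
}

top = localize(api_scores, api_to_methods)
for method, score in top:
    print(f"{method}: {score:.2f}")
```

In this toy input the method that invokes both the SMS and device-ID APIs ranks first. Class-level localization could be sketched the same way by grouping on the class prefix of each method name.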

Paper Structure

This paper contains 31 sections, 12 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: XAIDroid Conceptual Diagram
  • Figure 2: API Call Graph of the Smali Code given in Table \ref{tab:smali}
  • Figure 3: Method Level MCL - Recall and F1-Score Metrics