LAMD: Context-driven Android Malware Detection and Classification with LLMs

Xingzhi Qian, Xinran Zheng, Yiling He, Shuo Yang, Lorenzo Cavallaro

TL;DR

LAMD presents a context-driven framework for Android malware detection that overcomes LLM context limits and structural complexity by extracting key contextual signals and applying tier-wise code reasoning. It couples static analysis-derived key context with backward slicing to generate compact, semantically rich representations fed into a three-tier LLM reasoning pipeline, guarded by a factual consistency verifier using Data Relationship Coverage. Across real-world datasets exhibiting distribution drift, LAMD outperforms conventional detectors in detection accuracy and provides interpretable explanations, albeit with higher computational cost. The work demonstrates the feasibility of scalable, explainable LLM-powered malware analysis and outlines future directions for hybridizing LLMs with traditional detectors to handle evolving Android threats. Overall, LAMD advances practical AI-driven malware analysis by balancing zero-shot reasoning with rigorous context-aware verification, enabling robust, interpretable defenses in dynamic mobile threat landscapes.

Abstract

The rapid growth of mobile applications has escalated Android malware threats. Although there are numerous detection methods, they often struggle with evolving attacks, dataset biases, and limited explainability. Large Language Models (LLMs) offer a promising alternative with their zero-shot inference and reasoning capabilities. However, applying LLMs to Android malware detection presents two key challenges: (1) the extensive support code in Android applications, often spanning thousands of classes, exceeds LLMs' context limits and obscures malicious behavior within benign functionality; (2) the structural complexity and interdependencies of Android applications surpass LLMs' sequence-based reasoning, fragmenting code analysis and hindering malicious intent inference. To address these challenges, we propose LAMD, a practical context-driven framework to enable LLM-based Android malware detection. LAMD integrates key context extraction to isolate security-critical code regions and construct program structures, then applies tier-wise code reasoning to analyze application behavior progressively, from low-level instructions to high-level semantics, producing a final prediction and explanation. A factual consistency verification mechanism mitigates LLM hallucinations arising in the first tier. Evaluation in real-world settings demonstrates LAMD's effectiveness over conventional detectors, establishing a feasible basis for LLM-driven malware analysis in dynamic threat landscapes.

Paper Structure

This paper contains 29 sections, 1 equation, 2 figures, 7 tables, 1 algorithm.

Figures (2)

  • Figure 1: Two failure cases of applying LLMs to Android malware detection: (1) context window limitations and (2) failure to capture malicious intent. LAMD addresses these challenges by extracting key contexts and tiered reasoning to capture structures and semantics efficiently.
  • Figure 2: The workflow of LAMD. Suspicious APIs are identified via predefined rules (Step 1), and their calling functions with control flow graphs are extracted through static analysis. A customized backward slicing technique refines relevant instructions, preserving potential malicious intent (Step 2). In the code reasoning phase, the structured control flow graph, function relationships, and suspicious APIs form hierarchical tiers for malware detection and human-readable explanations (Steps 3-6). Factual consistency verification ensures first-tier summary reliability, mitigating hallucination (Step 4).
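
The key context extraction in Figure 2 (Steps 1-2) can be illustrated with a minimal sketch. It is not the authors' implementation: the paper describes a customized backward slicing over control flow graphs, whereas the sketch below handles only straight-line code and data dependences. The `Instr` type, the `SUSPICIOUS_APIS` rule set, and the toy `PROGRAM` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Hypothetical rule set; the actual predefined rules are not given here.
SUSPICIOUS_APIS = {"sendTextMessage"}


@dataclass(frozen=True)
class Instr:
    """A toy three-address instruction with the registers it defines and uses."""
    text: str
    defs: frozenset = frozenset()
    uses: frozenset = frozenset()


def find_suspicious_calls(instrs):
    """Return indices of instructions that mention a suspicious API name."""
    return [i for i, ins in enumerate(instrs)
            if any(api in ins.text for api in SUSPICIOUS_APIS)]


def backward_slice(instrs, criterion_index):
    """Keep only instructions the criterion depends on through data flow.

    Walks backward from the criterion, tracking the set of registers whose
    defining instruction has not been found yet.
    """
    relevant = set(instrs[criterion_index].uses)
    kept = [criterion_index]
    for i in range(criterion_index - 1, -1, -1):
        ins = instrs[i]
        if ins.defs & relevant:
            kept.append(i)
            relevant -= ins.defs   # these registers are now explained
            relevant |= ins.uses   # their inputs become relevant instead
    return sorted(kept)


# Toy function body: the device ID is read and sent out by SMS, mixed with
# unrelated logging code that the slice should drop.
PROGRAM = [
    Instr("v0 = getDeviceId()", frozenset({"v0"})),
    Instr('v1 = "debug"', frozenset({"v1"})),
    Instr("v2 = Log.d(v1)", frozenset({"v2"}), frozenset({"v1"})),
    Instr('v3 = "5551234"', frozenset({"v3"})),
    Instr("v4 = concat(v0)", frozenset({"v4"}), frozenset({"v0"})),
    Instr("sendTextMessage(v3, v4)", frozenset(), frozenset({"v3", "v4"})),
]

if __name__ == "__main__":
    for call in find_suspicious_calls(PROGRAM):
        for i in backward_slice(PROGRAM, call):
            print(PROGRAM[i].text)
```

In this toy case the two logging instructions are dropped, and the slice keeps the device ID read, the destination number, the concatenation, and the SMS call. That compact, intent-preserving view is the kind of input the first reasoning tier is described as receiving.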