Bamboo: LLM-Driven Discovery of API-Permission Mappings in the Android Framework
Han Hu, Wei Minn, Yonghui Liu, Jiakun Liu, Ferdian Thung, Terry Yue Zhuo, Lwin Khin Shar, Debin Gao, David Lo
TL;DR
This work tackles the challenge of incomplete and ambiguous Android API-permission mappings by introducing Bamboo, an LLM-driven three-phase pipeline. It combines static SDK API extraction, dual-role LLM analysis to infer required permissions, and API-driven test-case generation with emulator-based verification to validate mappings. Empirical results show Bamboo substantially outperforms state-of-the-art baselines across multiple Android versions, and it also reveals major gaps and inconsistencies in official SDK documentation. The study further analyzes how permission mappings evolve across framework releases, offering actionable insights for developers and security analysts to maintain accurate permission declarations and robust security postures.
Abstract
The permission mechanism in the Android Framework is integral to safeguarding user privacy: it governs the access that users and processes have to sensitive resources and operations. Developers therefore need an in-depth understanding of API permissions to build robust Android apps. Unfortunately, the official Android API documentation is chronically imprecise and incomplete, so developers must spend significant effort to discern which permissions an API requires. This can lead to incorrect permission declarations in Android apps, which may in turn result in security violations and app failures. Recent efforts to improve permission specifications primarily leverage static and dynamic code analyses to uncover API-permission mappings within the Android framework. Yet these methods have substantial shortcomings, including poor adaptability to Android SDK and Framework updates, restricted code coverage, and a propensity to overlook essential API-permission mappings in intricate codebases. This paper introduces a pioneering approach that uses large language models (LLMs) to systematically examine API-permission mappings. In addition to employing LLMs, we integrate a dual-role prompting strategy and an API-driven code generation approach into our mapping discovery pipeline, and implement the pipeline in a tool, Bamboo. We formulate three research questions to evaluate the efficacy of Bamboo against state-of-the-art baselines, assess the completeness of the official SDK documentation, and analyze the evolution of permission-required APIs across SDK releases. Our experimental results show that Bamboo identifies 2,234, 3,552, and 4,576 API-permission mappings in Android versions 6, 7, and 10, respectively, substantially outperforming existing baselines.
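The three phases summarized above (static API extraction, dual-role LLM analysis, and test-based verification) can be pictured with a minimal sketch. It is not Bamboo's implementation: every name below (`Mapping`, `extract_apis`, `infer_permissions`, `verify`, `discover`) is hypothetical, the two LLM roles are assumed here to be a proposer and a reviewer, and the LLM and the emulator are stood in for by plain callables supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Mapping:
    """One discovered API-permission pair (hypothetical record type)."""
    api: str
    permission: str


def extract_apis(sdk_index: Dict[str, str]) -> List[str]:
    """Phase 1: list the API signatures found in the SDK index."""
    return sorted(sdk_index)


def infer_permissions(
    api: str,
    source: str,
    proposer: Callable[[str, str], Set[str]],
    reviewer: Callable[[str, str, str], bool],
) -> Set[str]:
    """Phase 2: one role proposes permissions, a second role reviews them.

    Assumption: the dual roles are a proposer and a reviewer; only
    permissions accepted by the reviewer are kept.
    """
    return {p for p in proposer(api, source) if reviewer(api, source, p)}


def verify(api: str, permission: str,
           run_test: Callable[[str, Set[str]], bool]) -> bool:
    """Phase 3: a generated test should fail without the permission
    and pass once it is granted (run_test stands in for the emulator)."""
    return (not run_test(api, set())) and run_test(api, {permission})


def discover(
    sdk_index: Dict[str, str],
    proposer: Callable[[str, str], Set[str]],
    reviewer: Callable[[str, str, str], bool],
    run_test: Callable[[str, Set[str]], bool],
) -> List[Mapping]:
    """Run the three phases in order and collect the verified mappings."""
    mappings: List[Mapping] = []
    for api in extract_apis(sdk_index):
        candidates = infer_permissions(api, sdk_index[api], proposer, reviewer)
        for permission in sorted(candidates):
            if verify(api, permission, run_test):
                mappings.append(Mapping(api, permission))
    return mappings
```

Keeping the LLM roles and the emulator behind callables is a choice made for this sketch only: it lets the control flow be exercised with stubs, and it makes explicit that a candidate mapping is reported only after it survives both the review step and the run-time check.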
