Automated Test Transfer Across Android Apps Using Large Language Models
Benyamin Beyzaei, Saghar Talebipour, Ghazal Rafiei, Nenad Medvidovic, Sam Malek
TL;DR
LLMigrate is a two-phase, multimodal LLM-based approach for automatically transferring usage-based UI tests across Android apps without source code access. It first abstracts source tests into a natural-language representation, then migrates them to target apps through dynamic UI analysis and LLM-guided decisions. This yields high transfer accuracy ($\approx 97$–$99\%$) and substantial reductions in manual effort ($>90\%$) across diverse app domains. The evaluation on CraftDroid-based benchmarks and unseen apps demonstrates strong precision, recall, and transfer efficiency, substantially outperforming prior techniques in most settings. The work highlights practical impact for rapid test maintenance and cross-app testing, outlines limitations related to LLM behavior and UI metadata quality, and suggests avenues for broader platform support and accessibility enhancements.
Abstract
The pervasiveness of mobile apps in everyday life necessitates robust testing strategies to ensure quality and efficiency, especially end-to-end usage-based tests of the apps' user interfaces (UIs). However, manually creating and maintaining such tests can be costly for developers. Since many apps share similar functionality beneath diverse UIs, previous work has shown that UI tests can be transferred across different apps within the same domain, eliminating the need to write the tests manually. These methods, however, have struggled to accommodate real-world variation: they often break down when the source and target apps are not very similar, or they fail to transfer test oracles accurately. This paper introduces LLMigrate, a technique that leverages Large Language Models (LLMs) to efficiently transfer usage-based UI tests across mobile apps. Our experimental evaluation shows that LLMigrate can achieve a 97.5% success rate in automated test transfer, reducing the manual effort required to write tests from scratch by 91.1%. This represents an improvement of 9.1% in success rate and 38.2% in effort reduction over the best-performing prior technique, setting a new benchmark for automated test transfer.
