ReuseDroid: A VLM-empowered Android UI Test Migrator Boosted by Active Feedback
Xiaolei Li, Jialun Cao, Yepang Liu, Shing-Chi Cheung, Hailong Wang
TL;DR
This paper tackles the challenge of migrating Android GUI tests when the source and target apps follow different operational logic. It introduces ReuseDroid, a multi-agent framework powered by Large Vision-Language Models (VLMs). Offline, ReuseDroid distills a test skeleton (target functionality, key steps, stop condition) from the source test; online, it explores the target app using visual context and iterative feedback. The Test Analyzer, Planner, Execution, and Feedback agents work together to eliminate redundant steps, infer transferable logic, and correct missteps through a constraint-aware, vision-enabled exploration loop. Evaluated on LinPro, ReuseDroid successfully migrates 90.3% of the migration tasks, outperforming the best mapping-based and LLM-based baselines by 318.1% and 109.1%, respectively, which suggests strong potential for scalable GUI test migration in real-world app ecosystems.
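The offline/online split described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `TestSkeleton`, `migrate`, and the `toy_*` functions are hypothetical, the skeleton is written by hand rather than produced by a Test Analyzer, and the agents are plain stub functions instead of VLM calls on screenshots.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class TestSkeleton:
    """Offline output (hypothetical shape): the core logic of the source test."""
    target_functionality: str
    key_steps: List[str]
    stop_condition: str


@dataclass
class MigrationResult:
    actions: List[str] = field(default_factory=list)
    success: bool = False


def migrate(
    skeleton: TestSkeleton,
    plan: Callable[[TestSkeleton, List[str], Optional[str]], Optional[str]],
    execute: Callable[[str], str],
    review: Callable[[TestSkeleton, str, str], str],
    max_steps: int = 20,
) -> MigrationResult:
    """Online loop: plan an action, execute it, and let feedback accept or reject it."""
    result = MigrationResult()
    feedback: Optional[str] = None
    for _ in range(max_steps):
        action = plan(skeleton, result.actions, feedback)
        if action is None:  # planner has nothing left to propose
            break
        screen = execute(action)
        verdict = review(skeleton, action, screen)
        if verdict == "done":  # stop condition reached
            result.actions.append(action)
            result.success = True
            break
        if verdict == "ok":  # action accepted, keep it in the migrated test
            result.actions.append(action)
            feedback = None
        else:  # misstep: drop the action and pass the critique to the planner
            feedback = verdict
    return result


# --- Toy stand-ins for the agents (assumptions, for illustration only) ---
def toy_plan(skeleton, done, feedback):
    if not done and feedback is None:
        return "tap banner ad"  # deliberate misstep to exercise the feedback path
    remaining = [s for s in skeleton.key_steps if s not in done]
    return remaining[0] if remaining else None


def toy_execute(action):
    return f"screen after '{action}'"  # a real executor would return a screenshot


def toy_review(skeleton, action, screen):
    if action not in skeleton.key_steps:
        return "off-task: action does not serve the target functionality"
    return "done" if action == skeleton.key_steps[-1] else "ok"


skeleton = TestSkeleton(
    target_functionality="create a note",
    key_steps=["open editor", "type note text", "tap save"],
    stop_condition="the new note appears in the note list",
)
result = migrate(skeleton, toy_plan, toy_execute, toy_review)
print(result.success, result.actions)
```

In this toy run the first proposed action is rejected by the reviewer, the critique is handed back to the planner, and the migrated test ends up containing only the three key steps. Passing the agents as callables keeps the loop independent of how each agent is realised.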
Abstract
GUI testing is an essential quality assurance process in mobile app development. However, creating and maintaining GUI tests for mobile apps is resource-intensive and costly. Recognizing that many apps share similar functionalities, researchers have proposed various techniques to migrate GUI tests from one app to another with similar features. For example, some techniques employ mapping-based approaches to align the GUI elements traversed by the tests of a source app with those present in the target app. Other test migration techniques leverage large language models (LLMs) by adapting the GUI tasks in source tests. However, these techniques are ineffective when the operational logic of the source and target apps differs. Because they do not analyze these differing operational flows, they may infer the semantics of GUI elements incorrectly. In this work, we propose ReuseDroid, a novel multi-agent framework for GUI test migration empowered by Large Vision-Language Models (VLMs). ReuseDroid comprises multiple VLM-based agents, each tackling a stage of the test migration process by leveraging the relevant visual and textual information embedded in GUI pages. A key insight of ReuseDroid is to migrate tests based only on the core logic shared across similar apps, even though their complete operational logic may differ. We evaluate ReuseDroid on LinPro, a new test migration dataset that consists of 578 migration tasks for 39 popular apps across 4 categories. The experimental results show that ReuseDroid successfully migrates 90.3% of the migration tasks, outperforming the best mapping-based and LLM-based baselines by 318.1% and 109.1%, respectively.
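To put the headline numbers in context, the baseline success rates they imply can be back-calculated. This sketch assumes the reported gains are relative improvements in success rate; the abstract does not state how the percentages are computed, so the outputs are derived estimates, not figures reported by the authors. The function name `implied_baseline` is hypothetical.

```python
def implied_baseline(ours_pct: float, relative_gain_pct: float) -> float:
    """Solve ours = baseline * (1 + gain/100) for baseline (assumed relation)."""
    return ours_pct / (1 + relative_gain_pct / 100)


# Figures taken from the abstract: 90.3% success, +318.1% and +109.1% gains.
mapping_baseline = round(implied_baseline(90.3, 318.1), 1)
llm_baseline = round(implied_baseline(90.3, 109.1), 1)
print(mapping_baseline, llm_baseline)
```

Under that reading, the best mapping-based baseline would sit at roughly 21.6% and the best LLM-based baseline at roughly 43.2%.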
