Test Script Intention Generation for Mobile Application via GUI Image and Code Understanding
Shengcheng Yu, Chunrong Fang, Jia Liu, Zhenyu Chen
TL;DR
This paper tackles the challenge of understanding mobile GUI test scripts, which are loosely tied to app logic and often poorly documented. It introduces TestIntention, a two-component framework that combines GUIIntention (textual extraction, OCR, and widget-image captioning) and CodeIntention (response-method mapping and code-based intention generation) to infer per-operation and overall test-script intentions from an operation-sequence representation. Through extensive experiments on 50 open-source Android apps and 500 test scripts, TestIntention outperforms baselines in intention generation, demonstrates strong operation-mapping capability, and shows substantial time savings for developers in a user study, with widget-image understanding and code-intention modules contributing meaningfully. The approach promises practical benefits for test script maintenance, cross-app recommendations, and test-slice optimization, and lays groundwork for broader applicability across different test-script drivers and languages.
Abstract
Testing is the most direct and effective technique to ensure software quality. In mobile app testing, test scripts play a more important role than test cases for source code, because mobile applications (apps) are GUI-intensive and event-driven. Test scripts focus on user interactions and the corresponding response events, which are central to testing the functionalities of the target app. Understanding test scripts is therefore critical for script maintenance and modification. Mature code understanding (i.e., code comment generation) technologies exist and can be applied directly to functional source code that contains business logic. However, these technologies struggle with test scripts, because test scripts are only loosely linked to apps under test (AUT) by widget selectors and contain no business logic themselves. To close this gap in test script understanding, this paper presents a novel approach, TestIntention, that infers the intention of GUI test scripts. Test intention refers to the user's expectations of app behavior for specific operations. TestIntention formalizes test scripts with an operation sequence model. For each operation in the sequence, TestIntention extracts the target widget selector and links it to the GUI layout information or to the corresponding response events. For widgets identified by XPath, TestIntention uses image understanding technologies to extract detailed information from the widget images, whose intention is then inferred with a deep learning model. For widgets identified by ID, TestIntention first maps the selectors to the response methods that contain business logic, and then adopts code understanding technologies to describe the code in natural language. The results for all operations are combined to generate the test intention of the whole test script.
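The routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Operation` class, the function names, and the placeholder strings are all hypothetical, and the two branch functions merely stand in for the image understanding model and the code understanding model. The sketch shows only the control flow, in which each operation is dispatched by selector type and the per-operation results are then combined.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Operation:
    """One step of a test script (hypothetical representation)."""
    action: str         # e.g. "click" or "input"
    selector_type: str  # "xpath" or "id"
    selector: str       # the widget selector string


def gui_intention(op: Operation) -> str:
    # Placeholder for the GUI branch: a real system would locate the
    # widget image and describe it with an image understanding model.
    return f"{op.action} widget at {op.selector} [from GUI image]"


def code_intention(op: Operation) -> str:
    # Placeholder for the code branch: a real system would map the ID to
    # its response method and summarize that method in natural language.
    return f"{op.action} widget '{op.selector}' [from response method]"


def infer_operation_intentions(ops: List[Operation]) -> List[str]:
    """Dispatch each operation by selector type, as the abstract describes."""
    intentions = []
    for op in ops:
        if op.selector_type == "xpath":
            intentions.append(gui_intention(op))
        elif op.selector_type == "id":
            intentions.append(code_intention(op))
        else:
            raise ValueError(f"unsupported selector type: {op.selector_type}")
    return intentions


def combine_intentions(intentions: List[str]) -> str:
    # Assumed combination rule: join per-operation results in script order.
    return "; then ".join(intentions)


script = [
    Operation("click", "id", "login_button"),
    Operation("input", "xpath", "//android.widget.EditText[1]"),
]
per_operation = infer_operation_intentions(script)
overall = combine_intentions(per_operation)
```

Joining the per-operation strings in order is the simplest possible combination step and is chosen here only for illustration; the abstract does not specify how the results are combined.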
