Prompt2Task: Automating UI Tasks on Smartphones from Textual Prompts
Tian Huang, Chun Yu, Weinan Shi, Zijian Peng, David Yang, Weiqi Sun, Yuanchun Shi
TL;DR
Prompt2Task lowers the barrier to adopting UI task automation by converting unrestricted textual prompts into smartphone UI actions through a three-stage pipeline run by a cooperative multi-agent system. The pipeline combines information collection, instruction generation, and operation mapping with an accumulated knowledge base (Historical Task Repository, Context Library, Instruction Set, Mobile Interaction Graph) and a human-in-the-loop design that keeps user intervention low while preserving reliability. Task success rises from $22.28\%$ for the baseline to $95.24\%$ with Prompt2Task on 2,500 prompts, with ≈$0.69$ user interventions per new task, and user studies confirm usability and effectiveness for both skilled and unskilled users. Open-ended knowledge and continuous learning let Prompt2Task scale across apps and tasks, with practical applications in tutorial generation, smart assistance, and customer support. The paper also outlines future directions for handling GUI dynamics and reducing latency.
Abstract
UI task automation enables efficient task execution by simulating human interactions with graphical user interfaces (GUIs), without modifying the existing application code. However, its broader adoption is constrained by the need for expertise in both scripting languages and workflow design. To address this challenge, we present Prompt2Task, a system designed to comprehend various task-related textual prompts (e.g., goals, procedures), thereby generating and performing the corresponding automation tasks. Prompt2Task incorporates a suite of intelligent agents that mimic human cognitive functions, specializing in interpreting user intent, managing external information for task generation, and executing operations on smartphones. The agents can learn from user feedback and continuously improve their performance based on the accumulated knowledge. Experimental results indicated a performance jump from a 22.28% success rate in the baseline to 95.24% with Prompt2Task, requiring an average of 0.69 user interventions for each new task. Prompt2Task presents promising applications in fields such as tutorial creation, smart assistance, and customer service.
