Identifying User Goals from UI Trajectories
Omri Berkovitch, Sapir Caduri, Noam Kahlon, Anatoly Efros, Avi Caciularu, Ido Dagan
TL;DR
This work introduces a novel task of identifying user goals from UI trajectories, producing natural language goal descriptions from multimodal UI traces. It formalizes evaluation via task fulfillment and satisfaction, and validates the approach using inverted UI automation datasets from web (Mind2Web) and Android (AitW/AitZ) with human and multimodal model benchmarks. Experiments reveal a substantial gap between expert humans and current models (Gemini and GPT-4o), underscoring the complexity of inferring intents from UI interactions. The study lays a foundation for future work in goal-aware agents and personalization, with potential extensions to more GUI platforms and ethical considerations for user privacy.
Abstract
Identifying underlying user goals and intents has been recognized as valuable in various personalization-oriented settings, such as personalized agents, improved search responses, advertising, user analytics, and more. In this paper, we propose a new task, goal identification from observed UI trajectories, which aims to infer the user's detailed intentions when performing a task within UI environments. To support this task, we also introduce a novel evaluation methodology designed to assess whether two intent descriptions can be considered paraphrases within a specific UI environment. Furthermore, we demonstrate how this task can leverage datasets designed for the inverse problem of UI automation, utilizing Android and web datasets for our experiments. To benchmark this task, we compare the performance of humans and state-of-the-art models, specifically GPT-4 and Gemini-1.5 Pro, using our proposed metric. The results reveal that both Gemini and GPT underperform relative to humans, underscoring the challenge of the proposed task and the significant room for improvement. This work highlights the importance of goal identification within UI trajectories, providing a foundation for further exploration and advancement in this area.
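The idea of reusing UI automation datasets for the inverse problem can be illustrated with a minimal sketch. A UI automation example pairs a goal (the input) with a trajectory (the output); goal identification swaps the two roles. The class and field names below (`Step`, `AutomationExample`, `GoalIdentificationExample`, `invert`) are hypothetical and chosen for illustration only; they do not reflect the actual schemas of Mind2Web, AitW, or AitZ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One observed step of a UI trajectory (hypothetical schema)."""
    screenshot_path: str  # assumed: path to a screenshot of the UI state
    action: str           # assumed: textual description of the user action


@dataclass
class AutomationExample:
    """UI automation: the goal is the input, the trajectory is the output."""
    goal: str
    trajectory: List[Step]


@dataclass
class GoalIdentificationExample:
    """Goal identification: the trajectory is the input, the goal is the reference."""
    trajectory: List[Step]
    reference_goal: str


def invert(example: AutomationExample) -> GoalIdentificationExample:
    """Swap input and output roles of a UI automation example."""
    return GoalIdentificationExample(
        trajectory=list(example.trajectory),  # copy so the source stays untouched
        reference_goal=example.goal,
    )


automation = AutomationExample(
    goal="Book a table for two at an Italian restaurant",
    trajectory=[
        Step("step_0.png", "type 'Italian' into the search box"),
        Step("step_1.png", "click 'Reserve' and select 2 guests"),
    ],
)
inverted = invert(automation)
```

Under this framing, a model would receive `inverted.trajectory` and produce a goal description, which is then compared against `inverted.reference_goal` using the paraphrase-style evaluation described above.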
