DuetUI: A Bidirectional Context Loop for Human-Agent Co-Generation of Task-Oriented Interfaces
Yuan Xu, Shaowen Xiang, Yizhi Song, Ruoting Sun, Xin Tong
TL;DR
DuetUI introduces a bidirectional context loop for human–agent co-generation of task-oriented interfaces, addressing end-user needs for fluid collaboration with LLM-driven GUI agents. It operationalizes a six-stage co-generation process and four features (staged generation, tangible agency, task–interface duality, bidirectional history) within a three-layer architecture that supports two coupled loops (Task and Interface). Technical and user studies show improved task efficiency, higher usability, and stronger perceived alignment with user intent, despite trade-offs in precision under expanded context. The work demonstrates a practical, end-user–friendly path toward interactive AI that learns from ongoing user interactions and supports dynamic autonomy, with implications for more trustworthy and adaptable AI-enabled interfaces.
Abstract
Large Language Models are reshaping task automation, yet they remain limited on complex, multi-step real-world tasks that require aligning with vague user intent and allowing dynamic user override. In a formative study with 12 participants, we found that end-users actively seek to shape task-oriented interfaces rather than rely on one-shot outputs. To address this, we introduce the human-agent co-generation paradigm, materialized in DuetUI. This LLM-empowered system unfolds alongside task progress through a bidirectional context loop: the agent scaffolds the interface by decomposing the task, while the user's direct manipulations implicitly steer the agent's next generation step. In a technical ablation study and a user study with 24 participants, DuetUI improved task efficiency and interface usability, supporting more seamless human-agent collaboration. Our contributions are the co-generation paradigm itself, a proof-of-concept DuetUI prototype that embodies it, and empirical and technical insights from an initial evaluation into how this bidirectional loop may help align agents with human intent and inform future development.
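The bidirectional context loop described above can be illustrated with a minimal sketch. It is not the DuetUI implementation: every name here (SharedContext, Event, stub_agent, co_generate) is hypothetical, the LLM call is replaced by a deterministic stub, and the task decomposition is assumed to be given as a list of steps. The sketch only shows the two directions of the loop: the agent writes each generated interface step into a shared history, and the user's edits enter the same history and shape the next generation step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    source: str    # "agent" or "user"
    kind: str      # "generate" or "edit"
    payload: dict


@dataclass
class SharedContext:
    """Hypothetical shared history that both the agent and the user write to."""
    task: str
    history: List[Event] = field(default_factory=list)

    def record(self, source: str, kind: str, payload: dict) -> None:
        # Store a copy so later mutations of the spec do not rewrite history.
        self.history.append(Event(source, kind, dict(payload)))

    def user_edits(self) -> List[dict]:
        return [e.payload for e in self.history if e.source == "user"]


def stub_agent(ctx: SharedContext, step: str) -> dict:
    """Stand-in for an LLM call (assumption): builds a UI spec for one step
    and carries forward every user edit seen so far as a preference."""
    spec = {"step": step, "widget": "form"}
    for edit in ctx.user_edits():
        spec.update(edit)  # later edits override earlier ones
    return spec


def co_generate(
    ctx: SharedContext,
    steps: List[str],
    agent: Callable[[SharedContext, str], dict],
    get_user_edit: Callable[[dict], Optional[dict]],
) -> List[dict]:
    """Generate the interface one step at a time, interleaved with user edits."""
    interface = []
    for step in steps:
        spec = agent(ctx, step)                # agent -> interface
        ctx.record("agent", "generate", spec)
        interface.append(spec)
        edit = get_user_edit(spec)             # user -> context
        if edit:
            spec.update(edit)
            ctx.record("user", "edit", edit)
    return interface
```

For example, if the user switches the first step to a compact layout, the stub agent applies that preference to the second step without being asked again. A real agent would pass the history to a model rather than merging dictionaries.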
