DuetUI: A Bidirectional Context Loop for Human-Agent Co-Generation of Task-Oriented Interfaces

Yuan Xu, Shaowen Xiang, Yizhi Song, Ruoting Sun, Xin Tong

TL;DR

DuetUI introduces a bidirectional context loop for human–agent co-generation of task-oriented interfaces, addressing end-user needs for fluid collaboration with LLM-driven GUI agents. It operationalizes a six-stage co-generation process and four features (staged generation, tangible agency, task–interface duality, bidirectional history) within a three-layer architecture that supports two coupled loops (Task and Interface). Technical and user studies show improved task efficiency, higher usability, and stronger perceived alignment with user intent, despite trade-offs in precision under expanded context. The work demonstrates a practical, end-user–friendly path toward interactive AI that learns from ongoing user interactions and supports dynamic autonomy, with implications for more trustworthy and adaptable AI-enabled interfaces.

Abstract

Large Language Models are reshaping task automation, yet remain limited in complex, multi-step real-world tasks that require aligning with vague user intent and enabling dynamic user override. From a formative study with 12 participants, we found that end-users actively seek to shape task-oriented interfaces rather than relying on one-shot outputs. To address this, we introduce the human-agent co-generation paradigm, materialized in DuetUI. This LLM-empowered system unfolds alongside task progress through a bidirectional context loop: the agent scaffolds the interface by decomposing the task, while the user's direct manipulations implicitly steer the agent's next generation step. In a technical ablation study and a user study with 24 participants, DuetUI improved task efficiency and interface usability, supporting more seamless human-agent collaboration. Our contributions include the proposal of this novel paradigm, the design of a proof-of-concept DuetUI prototype embodying it, and empirical and technical insights from an initial evaluation of how this bidirectional loop may help align agents with human intent and inform future development.
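
To make the bidirectional context loop concrete, the following is a minimal sketch of one way a shared action history could feed the agent's next generation step. It is an illustration under stated assumptions, not the DuetUI implementation: the names `Action`, `SharedHistory`, `context_for_agent`, and `plan_next_step` are hypothetical, and the LLM call is replaced by a simple rule so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    """One entry in the shared log (hypothetical structure)."""
    actor: str        # "agent" or "user"
    description: str  # what was generated or manipulated


@dataclass
class SharedHistory:
    """A single log that both the agent and the user write to."""
    actions: List[Action] = field(default_factory=list)

    def record(self, actor: str, description: str) -> None:
        self.actions.append(Action(actor, description))

    def context_for_agent(self, window: int = 5) -> str:
        # Serialize the most recent actions from both sides so they
        # could be placed in the agent's prompt for the next step.
        recent = self.actions[-window:]
        return "\n".join(f"[{a.actor}] {a.description}" for a in recent)


def plan_next_step(history: SharedHistory) -> str:
    # Stand-in for an LLM call (assumption): if the user has manipulated
    # the interface, the latest manipulation steers the next generation.
    user_actions = [a for a in history.actions if a.actor == "user"]
    if user_actions:
        return f"Regenerate next stage honoring: {user_actions[-1].description}"
    return "Generate next stage from task decomposition"


history = SharedHistory()
history.record("agent", "Generated flight search form")
history.record("user", "Removed the hotel filter")
print(history.context_for_agent())
print(plan_next_step(history))
```

The design choice illustrated here is that user manipulations are logged in the same stream as agent actions, so the agent can read intent from the history without the user issuing an explicit instruction.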

Paper Structure

This paper contains 66 sections, 15 figures, and 6 tables.

Figures (15)

  • Figure 1: The Evolution of Task Automation Paradigms. (a) Traditional direct manipulation by the user. (b) Agent-centric full automation, which excludes the user. (c-d) Recent human-in-the-loop approaches that treat the user as a supervisor. (e) Our proposed bidirectional co-generation paradigm, enabling a seamless human-agent partnership.
  • Figure 2: The suite of UI components that provide Tangible Agency, allowing users to directly manipulate the agent's capabilities.
  • Figure 3: An example of the Bidirectional Action History. The log shows a sequence of agent actions and user actions, enabling the agent to infer intent and collaborate effectively.
  • Figure 4: Define
  • Figure 5: Emphasize
  • ...and 10 more figures