"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

Kihoon Son, Hyewon Lee, DaEun Choi, Yoonsu Kim, Tae Soo Kim, Yoonjoo Lee, John Joon Young Chung, HyunJoon Jung, Juho Kim

TL;DR

This work develops CLEO, an agent that interprets collaborative intent and adapts in real time, and presents a decision model with six interaction loops, design implications, and an annotated dataset.

Abstract

Human collaborators coordinate dynamically through process visibility and workspace awareness, yet AI agents typically either provide only final outputs or expose read-only execution processes (e.g., planning, reasoning) without interpreting concurrent user actions on shared artifacts. Building on mixed-initiative interaction principles, we explore whether agents can achieve collaborative context awareness -- interpreting concurrent user actions on shared artifacts and adapting in real time. Study 1 (N=10 professional designers) revealed that process visibility enabled reasoning about agent actions but exposed conflicts when agents could not distinguish feedback from independent work. We developed CLEO, which interprets collaborative intent and adapts in real time. Study 2 (N=10, a two-day study with stimulated-recall interviews) analyzed 214 turns, identifying five action patterns, six triggers, and four enabling factors that explain when designers choose delegation (70.1%), direction (28.5%), or concurrent work (31.8%). We present a decision model with six interaction loops, design implications, and an annotated dataset.

Paper Structure (70 sections, 11 figures, 1 table)

Figures (11)

  • Figure 1: Interaction flow of the first probe system. (a) The user clicks the record button to deliver a voice message (b) while selecting canvas elements for context. The transcribed message with selection changes is sent to the agent as input. The agent generates an action plan and iterates through cycles of (1) reason, (2) act, and (3) feedback. (d-f) Each iteration's progress is reflected immediately on the canvas as tools execute during the act stage, with the agent character floating near modified elements and displaying interim messages.
  • Figure 2: Agent structure of the first probe system. Upon receiving user input, the Plan Module generates an action plan. The ReAct Agent then iterates through a cycle of reasoning with tool selection, execution, summary, and feedback evaluation until the plan is fulfilled or the maximum of 10 iterations is reached. The Message Module generates the final response, reflecting both the agent's actions and the user's original message, and delivers it to the user.
  • Figure 3: Flexible co-creation scenario with Cleo. (a) The user invokes the agent by calling its name. Cleo allows users to (b) abort the agent's operation, (c) work concurrently with the agent, or (d) intervene directly with additional instructions at any time.
  • Figure 4: Agent structure of Cleo. Three modules have been updated from the first probe pipeline: User Change Detection Module, Attribution Change Module, and Plan Update Module.
  • Figure 5: Visualization of interaction logs. (a) The overall task process is segmented into periods when the agent is activated and deactivated, using the timestamps of agent task initiation (triggered by user requests) and task completion. (b) Each segment is visualized with labeled interaction events. (c) Example of a visualized segment during agent activation. (d) Example of a visualized segment during agent deactivation.
  • ...and 6 more figures
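
The control flow described in the Figure 2 caption (plan, then a bounded reason/act/summarize/evaluate cycle, then a final message) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function and class name here (`run_agent`, `plan_module`, `reason`, `plan_fulfilled`, `message_module`, `StepRecord`, `AgentRun`) is hypothetical, and the modules are plain callables standing in for what would presumably be model calls and canvas tools. Only the 10-iteration cap and the ordering of stages come from the caption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

MAX_ITERATIONS = 10  # iteration cap stated in the Figure 2 caption


@dataclass
class StepRecord:
    tool: str      # tool chosen during the reasoning stage (hypothetical field)
    summary: str   # summary of what the tool execution did (hypothetical field)


@dataclass
class AgentRun:
    plan: List[str]
    steps: List[StepRecord] = field(default_factory=list)
    message: str = ""


def run_agent(
    user_input: str,
    plan_module: Callable[[str], List[str]],
    reason: Callable[[List[str], List[StepRecord]], str],
    tools: Dict[str, Callable[[], str]],
    plan_fulfilled: Callable[[List[str], List[StepRecord]], bool],
    message_module: Callable[[str, List[StepRecord]], str],
) -> AgentRun:
    """Plan once, loop until the plan is fulfilled or the cap is hit, then respond."""
    run = AgentRun(plan=plan_module(user_input))
    for _ in range(MAX_ITERATIONS):
        tool_name = reason(run.plan, run.steps)           # reasoning with tool selection
        result = tools[tool_name]()                       # execution
        run.steps.append(StepRecord(tool_name, result))   # summary of the step
        if plan_fulfilled(run.plan, run.steps):           # feedback evaluation
            break
    # The final response reflects both the agent's actions and the original message.
    run.message = message_module(user_input, run.steps)
    return run


def demo() -> AgentRun:
    """Toy stand-ins for each module; none of these reflect the real system."""
    tools = {
        "add_title": lambda: "title added",
        "align_elements": lambda: "elements aligned",
    }
    return run_agent(
        "Make a poster",
        plan_module=lambda text: ["add_title", "align_elements"],
        reason=lambda plan, steps: plan[len(steps)],
        tools=tools,
        plan_fulfilled=lambda plan, steps: len(steps) >= len(plan),
        message_module=lambda text, steps: f"Done ({len(steps)} steps) for: {text}",
    )
```

Passing the modules in as callables keeps the loop itself independent of any particular model or tool set, which makes the termination conditions (plan fulfilled versus iteration cap) easy to check in isolation.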