Cocoa: Co-Planning and Co-Execution with AI Agents

K. J. Kevin Feng, Kevin Pu, Matt Latzke, Tal August, Pao Siangliulue, Jonathan Bragg, Daniel S. Weld, Amy X. Zhang, Joseph Chee Chang

TL;DR

Cocoa tackles the bottleneck of limited human-AI collaboration in planning by embedding an AI agent inside a document editor and introducing interactive plans for co-planning and co-execution. It leverages a notebook-inspired metaphor to allow users and agents to collaboratively draft, edit, and run plan steps, with flexible assignment of steps between human and machine and interleaved execution. Across a formative study, a lab study with 16 researchers, and a 7-day field deployment, Cocoa demonstrates enhanced agent steerability without sacrificing ease of use, and reveals task-dependent preferences for interactive planning versus chat-based interfaces. The work highlights practical design considerations for cost-aware, transparent, and collaborative AI assistants in scientific workflows and outlines a roadmap for future multimodal, collaborative in-document AI systems.

Abstract

Human collaboration benefits from continuous coordination -- planning, delegating tasks, sharing progress, and adjusting objectives -- to align on shared goals. However, agentic AI systems often limit users to previewing or reviewing an agent's plans for fully autonomous execution. While this may be useful for confirmation and correction, it does not support deeper collaboration between humans and AI agents. We present Cocoa, a system that introduces a novel design pattern -- interactive plans -- for collaborating with an AI agent on complex, multi-step tasks. Informed by a formative study (n=9), Cocoa builds on interaction designs from computational notebooks and document editors to support flexible delegation of agency through co-planning and co-execution, where users collaboratively compose and execute plans with an agent. Using scientific research as a sample domain, our lab (n=16) and field deployment (n=7) studies found that Cocoa improved agent steerability without sacrificing ease of use compared to a strong chat baseline. Additionally, researchers valued Cocoa for real-world projects and saw the interleaving of co-planning and co-execution as an effective novel paradigm for human-AI collaboration.

Paper Structure

This paper contains 77 sections, 13 figures, 2 tables.

Figures (13)

  • Figure 1: An overview of the Cocoa user interface. An interactive plan (A) affords human-AI co-planning and co-execution: a researcher and the AI agent can collaboratively edit the plan in the document and execute the plan steps, similar to executing code cells in a computational notebook. Steps can be assigned to the AI agent (B) or the researcher (C). The researcher can freely edit the AI agent's outputs in an interactive sidebar (D), such as adding relevant papers that the agent did not find (E) to help steer the agent with their feedback and expertise. In this example, the first three steps of a plan to summarize methods for human feedback elicitation have already been executed, and the agent is requesting guidance from the user in the next step.
  • Figure 2: A user invokes the agent on a piece of text in the document by clicking the "Invoke agent" button that appears on hover whenever text is highlighted. The agent will use the highlighted text and context from elsewhere in the document to propose a series of plans, displayed in a plan selector that appears under the highlighted text. Once the user selects a plan, it becomes fully interactive in the document.
  • Figure 3: For each plan step, the user can assign the step to either themselves (a user step) or the agent (an agent step) by using the step assignment toggle, as well as edit the step’s description.
  • Figure 4: Highlighting text within the step description will bring up an option for the agent to suggest an alternate step. The user can undo the suggestion or save to accept it.
  • Figure 5: If major changes to a step alter the rest of the plan’s trajectory, Cocoa detects this and triggers replanning. Replanning replaces subsequent steps with ones the agent suggests for "autocompleting" the plan.
  • ...and 8 more figures