Cocoa: Co-Planning and Co-Execution with AI Agents
K. J. Kevin Feng, Kevin Pu, Matt Latzke, Tal August, Pao Siangliulue, Jonathan Bragg, Daniel S. Weld, Amy X. Zhang, Joseph Chee Chang
TL;DR
Cocoa tackles the bottleneck of limited human-AI collaboration in planning by embedding an AI agent inside a document editor and introducing interactive plans for co-planning and co-execution. It leverages a notebook-inspired metaphor to allow users and agents to collaboratively draft, edit, and run plan steps, with flexible assignment of steps between human and machine and interleaved execution. Across a formative study, a lab study with 16 researchers, and a 7-day field deployment, Cocoa demonstrates enhanced agent steerability without sacrificing ease of use, and reveals task-dependent preferences for interactive planning versus chat-based interfaces. The work highlights practical design considerations for cost-aware, transparent, and collaborative AI assistants in scientific workflows and outlines a roadmap for future multimodal, collaborative in-document AI systems.
Abstract
Human collaboration benefits from continuous coordination (planning, delegating tasks, sharing progress, and adjusting objectives) to align on shared goals. However, agentic AI systems often limit users to previewing or reviewing an agent's plans for fully autonomous execution. While this may be useful for confirmation and correction, it does not support deeper collaboration between humans and AI agents. We present Cocoa, a system that introduces a novel design pattern, interactive plans, for collaborating with an AI agent on complex, multi-step tasks. Informed by a formative study (n=9), Cocoa builds on interaction designs from computational notebooks and document editors to support flexible delegation of agency through co-planning and co-execution, where users collaboratively compose and execute plans with an agent. Using scientific research as a sample domain, our lab (n=16) and field deployment (n=7) studies found that Cocoa improved agent steerability without sacrificing ease of use compared to a strong chat baseline. Additionally, researchers valued Cocoa for real-world projects and saw the interleaving of co-planning and co-execution as an effective novel paradigm for human-AI collaboration.
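To make the interactive-plan pattern concrete, the following is a minimal Python sketch of a plan whose steps can be assigned to either the user or the agent, edited between runs, and executed one at a time. It is an illustration under stated assumptions, not Cocoa's implementation: the names `PlanStep`, `InteractivePlan`, `run_next`, `fake_agent`, and `fake_user` are hypothetical, and the two executor functions are stand-ins for a real agent call and real user input.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical executor signature: (step description, prior outputs) -> output.
Executor = Callable[[str, List[str]], str]

VALID_ASSIGNEES = ("agent", "user")


@dataclass
class PlanStep:
    description: str
    assignee: str = "agent"        # who is responsible for this step
    output: Optional[str] = None   # None until the step has been run

    def __post_init__(self) -> None:
        if self.assignee not in VALID_ASSIGNEES:
            raise ValueError(f"unknown assignee: {self.assignee!r}")


@dataclass
class InteractivePlan:
    steps: List[PlanStep] = field(default_factory=list)

    def add_step(self, description: str, assignee: str = "agent",
                 index: Optional[int] = None) -> PlanStep:
        """Co-planning: append a step, or insert it at a given position."""
        step = PlanStep(description, assignee)
        if index is None:
            self.steps.append(step)
        else:
            self.steps.insert(index, step)
        return step

    def reassign(self, index: int, assignee: str) -> None:
        """Co-planning: hand a step over to the other party."""
        if assignee not in VALID_ASSIGNEES:
            raise ValueError(f"unknown assignee: {assignee!r}")
        self.steps[index].assignee = assignee

    def run_next(self, executors: Dict[str, Executor]) -> Optional[PlanStep]:
        """Co-execution: run only the first step that has no output yet."""
        for i, step in enumerate(self.steps):
            if step.output is None:
                # Every earlier step is complete, so its output is context.
                context = [s.output for s in self.steps[:i]]
                step.output = executors[step.assignee](step.description, context)
                return step
        return None  # nothing left to run


# Stand-in executors (assumptions, not real agent or UI calls).
def fake_agent(description: str, context: List[str]) -> str:
    return f"[agent] {description} (given {len(context)} prior outputs)"


def fake_user(description: str, context: List[str]) -> str:
    return f"[user] {description}"


executors = {"agent": fake_agent, "user": fake_user}

plan = InteractivePlan()
plan.add_step("Search for papers on human-AI co-planning")
plan.add_step("Summarize the most relevant papers")

plan.run_next(executors)   # the agent runs step 0
plan.reassign(1, "user")   # the user edits the plan between runs
plan.run_next(executors)   # the user completes step 1
```

Running a single step per call is what allows planning and execution to interleave in this sketch: the plan stays editable after each step, so later steps can be added or reassigned in light of earlier outputs.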
