Collaborating with AI Agents: Field Experiments on Teamwork, Productivity, and Performance

Harang Ju, Sinan Aral

TL;DR

The paper investigates how autonomous, multimodal AI agents influence teamwork, productivity, and output quality in a real-world creative task using a large randomized experiment on the Pairit platform. By comparing human–human and human–AI teams in ad creation and validating results with a field campaign on X, it shows that AI collaboration boosts individual productivity and text quality while decreasing image quality and output diversity, illustrating a jagged frontier of AI capabilities. Three mechanisms (task-oriented communication, delegation to AI, and recognition of the AI partner's identity) explain these effects, shaping how teams allocate work and how outputs perform in the field. These findings advance theories of human–AI collaboration, offer managerial guidance for integrating AI into creative workflows, and demonstrate Pairit as a scalable platform for studying agentic collaboration in complex tasks.

Abstract

We examined the mechanisms underlying productivity and performance gains from AI agents using a large-scale experiment on Pairit, a platform we developed to study human-AI collaboration. We randomly assigned 2,234 participants to human-human and human-AI teams that produced 11,024 ads for a think tank. We evaluated the ads using independent human ratings and a field experiment on X, which garnered ~5M impressions. We found human-AI teams produced 50% more ads per worker and higher text quality, while human-human teams produced higher image quality, suggesting a jagged frontier of AI agent capability. Human-AI teams also produced more homogeneous, or self-similar, outputs. The field experiment revealed higher text quality improved click-through rates and view-through duration, while higher image quality improved cost-per-click rates. We found three mechanisms explained these effects. First, human-AI collaboration was more task-oriented, with 25% more task-oriented messages and 18% fewer interpersonal messages. Second, human-AI collaboration displayed more delegation, as participants delegated 17% more work to AI agents than to human partners and performed 62% fewer direct text edits when working with AI. Third, recognition that the collaborator was an AI moderated these effects, as participants who correctly identified they were working with AI were more task-oriented and more likely to delegate work. These mechanisms then explained performance: task-oriented communication improved ad quality, specifically when working with AI, while interpersonal communication reduced ad quality; delegation improved text quality but had no effect on image quality and was positively associated with diversity collapse, creating homogeneous outputs of higher average quality. The results suggest AI agents drive changes in productivity, performance, and output diversity by reshaping teamwork.

Paper Structure

This paper contains 72 sections, 3 equations, 3 figures, and 14 tables.

Figures (3)

  • Figure 1: The Pairit platform. On the left is the task panel, and on the right is the chat panel. In the human-human condition, chat messages and edits on the task panel (including text edits, image selections, and AI image generations) are synchronized in real-time. In the human-AI condition, the participant chats with an AI agent with full context of the user interface (UI; see Section \ref{sec:methods:ai:context}), and the AI can edit text, select images, and generate AI images.
  • Figure 2: Overview of methods. (A) Participants are randomized into collaborating with another participant or an AI agent. (B) Participants collaborate with another participant or an AI agent to produce ads in a real-time collaborative workspace.
  • Figure F: The user interface for the ad quality survey.