Shaping Human-AI Collaboration: Varied Scaffolding Levels in Co-writing with Language Models

Paramveer S. Dhillon, Somayeh Molaei, Jiaqi Li, Maximilian Golub, Shaochun Zheng, Lionel P. Robert

TL;DR

This study investigates how varying AI scaffolding levels in co-writing with a large language model affect writing quality, productivity, and user experience. Using a within-subject, Latin-square field experiment (N=131) with three conditions (no AI, next-sentence, next-paragraph) and a custom GPT-3-based tool, the authors reveal a U-shaped effect: high-level paragraph scaffolding substantially improves quality and speed, especially for non-regular and less tech-savvy writers, while low-level sentence scaffolding can reduce quality and ownership. The results also show that user satisfaction and sense of authorship decline under scaffolded conditions, underscoring the need for personalized, adaptive scaffolding that preserves human agency. The findings offer concrete guidance for designing AI writing assistants that enhance output without eroding engagement, suggesting dynamic, user-aware scaffolding as a key direction for productive human-AI collaboration in writing.

Abstract

Advances in language modeling have paved the way for novel human-AI co-writing experiences. This paper explores how varying levels of scaffolding from large language models (LLMs) shape the co-writing process. Employing a within-subjects field experiment with a Latin square design, we asked participants (N=131) to respond to argumentative writing prompts under three randomly sequenced conditions: no AI assistance (control), next-sentence suggestions (low scaffolding), and next-paragraph suggestions (high scaffolding). Our findings reveal a U-shaped impact of scaffolding on writing quality and productivity (words/time). While low scaffolding did not significantly improve writing quality or productivity, high scaffolding led to significant improvements, especially benefiting non-regular writers and less tech-savvy users. No significant cognitive burden was observed while using the scaffolded writing tools, but a moderate decrease in text ownership and satisfaction was noted. Our results have broad implications for the design of AI-powered writing tools, including the need for personalized scaffolding mechanisms.
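
As an illustration of the design described in the abstract, the following is a minimal sketch of how a Latin square can counterbalance the order of the three conditions, along with the productivity measure (words/time). The function names, the cyclic construction, the round-robin assignment of participants to rows, and the use of minutes as the time unit are all assumptions made for this example; the abstract does not specify how the authors implemented the sequencing.

```python
# Hypothetical sketch: counterbalancing three conditions with a Latin square.
# Names and the assignment scheme are illustrative assumptions, not the
# authors' implementation.

CONDITIONS = ["control", "sentence", "paragraph"]


def latin_square(conditions):
    """Build a cyclic Latin square: each condition appears exactly once
    in every row (participant order) and every column (task position)."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_orders(participant_ids, conditions=CONDITIONS):
    """Assign each participant one row of the square, cycling through rows
    (assumed round-robin; a real study might randomize row assignment)."""
    square = latin_square(conditions)
    return {pid: square[i % len(square)] for i, pid in enumerate(participant_ids)}


def productivity(word_count, minutes):
    """Productivity as words per unit time (minutes assumed here)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return word_count / minutes


if __name__ == "__main__":
    for pid, order in assign_orders(["p1", "p2", "p3", "p4"]).items():
        print(pid, order)
    print(productivity(300, 10))
```

With this construction, every condition occupies every position in the sequence equally often across each group of three participants, which is what lets order effects be separated from condition effects in a within-subjects design.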

Paper Structure (41 sections, 3 figures, 10 tables)

Figures (3)

  • Figure 1: User Interaction with our interface: 1) The user writes down a sentence on their own. 2) The user presses the tab key to generate five suggestions but rejects all of them. 3) The user writes more on their own. 4) The user presses the tab key to generate suggestions, but this time selects the last one. 5) The generated text is added to the end of the user’s text, and the user edits it as they see fit.
  • Figure 2: Our custom AI co-writing interface and the three treatment conditions: a) No AI assistance (control), b) Sentence Mode (low scaffolding) and c) Paragraph Mode (high scaffolding).
  • Figure 3: The Custom-built Interface: Pre-task and Post-task Surveys