Agents' Room: Narrative Generation through Multi-step Collaboration

Fantine Huot, Reinald Kim Amplayo, Jennimaria Palomaki, Alice Shoshana Jakobovits, Elizabeth Clark, Mirella Lapata

TL;DR

The paper tackles long-form fiction generation by decomposing the task across a multi-agent system: planning and writing agents that share a scratchpad and are coordinated by a central orchestrator. It formalizes Agents' Room, introduces Tell Me A Story, a high-quality dataset of writing prompts and human-written stories, and develops an evaluation framework that combines human judgments with LLM-based assessments. Empirical results show that Agents' Room, especially with writing agents and fine-tuned components, yields stories that outperform end-to-end baselines across the dimensions of plot, creativity, development, and language, with results that align well with human judgments. The work advances scalable, modular narrative generation with robust evaluation and provides reproducibility resources for further research.

Abstract

Writing compelling fiction is a multifaceted process combining elements such as crafting a plot, developing interesting characters, and using evocative language. While large language models (LLMs) show promise for story writing, they currently rely heavily on intricate prompting, which limits their use. We propose Agents' Room, a generation framework inspired by narrative theory, that decomposes narrative writing into subtasks tackled by specialized agents. To illustrate our method, we introduce Tell Me A Story, a high-quality dataset of complex writing prompts and human-written stories, and a novel evaluation framework designed specifically for assessing long narratives. We show that Agents' Room generates stories that are preferred by expert evaluators over those produced by baseline systems by leveraging collaboration and specialization to decompose the complex story writing task into tractable components. We provide extensive analysis with automated and human-based metrics of the generated output.

Paper Structure

This paper contains 25 sections, 3 figures, 4 tables, and 1 algorithm.

Figures (3)

  • Figure 1: Agents' Room, a multi-agent framework for collaborative writing. A central orchestrator calls the individual agents and consolidates their contributions into the scratchpad. We color-code each piece of the scratchpad with the contributing agent's color.
  • Figure 2: Prompts from the Tell Me A Story dataset (corresponding stories are in the appendix).
  • Figure 3: Overall system ranking across dimensions of plot, creativity, development, and language, according to human ratings (a) and an LLM-based evaluator (b).
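
The orchestration pattern described in the Figure 1 caption (a central orchestrator calls individual agents and consolidates their contributions into a scratchpad) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The agent names, the fixed calling order, and the stub agents that return placeholder strings instead of calling an LLM are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical types: the scratchpad is an ordered list of
# (agent name, contribution) pairs; an agent maps the writing prompt
# and the current scratchpad to a new text contribution.
Scratchpad = List[Tuple[str, str]]
Agent = Callable[[str, Scratchpad], str]


def make_stub_agent(label: str) -> Agent:
    """Stand-in for an LLM-backed agent; returns a placeholder string."""
    def agent(prompt: str, scratchpad: Scratchpad) -> str:
        return f"[{label} for '{prompt}' given {len(scratchpad)} prior entries]"
    return agent


def orchestrate(prompt: str, agents: List[Tuple[str, Agent]]) -> Scratchpad:
    """Call each agent in a fixed order (an assumption) and append its
    output to the shared scratchpad, tagged with the agent's name."""
    scratchpad: Scratchpad = []
    for name, agent in agents:
        contribution = agent(prompt, scratchpad)
        scratchpad.append((name, contribution))
    return scratchpad


# Hypothetical agent line-up: two planning agents followed by one writing agent.
agents = [
    ("plot_planner", make_stub_agent("plot plan")),
    ("character_planner", make_stub_agent("character notes")),
    ("writer", make_stub_agent("story draft")),
]
pad = orchestrate("a lighthouse keeper finds a letter", agents)
for name, text in pad:
    print(f"{name}: {text}")
```

Tagging each scratchpad entry with the contributing agent's name mirrors the color-coding in Figure 1: every piece of the scratchpad can be traced back to the agent that wrote it, and later agents see everything written before them.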