PlayWrite: A Multimodal System for AI Supported Narrative Co-Authoring Through Play in XR

Esen K. Tütüncü, Qian Zhou, Frederik Brudy, George Fitzmaurice, Fraser Anderson

TL;DR

PlayWrite is a mixed-reality system where users author stories by directly manipulating virtual characters and props. It demonstrates an approach for co-creative systems that move beyond text to embrace direct manipulation and play as core interaction modalities.

Abstract

Current AI writing tools, which rely on text prompts, poorly support the spatial and interactive nature of storytelling where ideas emerge from direct manipulation and play. We present PlayWrite, a mixed-reality system where users author stories by directly manipulating virtual characters and props. A multi-agent AI pipeline interprets these actions into Intent Frames, structured narrative beats visualized as rearrangeable story marbles on a timeline. A large language model then transforms the user's assembled sequence into a final narrative. A user study (N=13) with writers from varying domains found that PlayWrite fosters a highly improvisational and playful process. Users treated the AI as a collaborative partner, using its unexpected responses to spark new ideas and overcome creative blocks. PlayWrite demonstrates an approach for co-creative systems that move beyond text to embrace direct manipulation and play as core interaction modalities.

Paper Structure (52 sections, 1 equation, 7 figures, 2 tables)

Figures (7)

  • Figure 1: PlayWrite Interface. (a) Grab to Talk: picking up a character triggers the user's voice input. (b) Drag to Move: repositioning the character's transform handle moves the character through space with character animations. (c) Attach a Prop: objects can be linked to characters to contextualize actions. (d) AI Respond: the system generates dialogue and narrative continuation in response to user actions. (e) Marble Assembly: users arrange story marbles on a timeline, each representing an Intent Frame, to sequence and refine the story. Dialogue and character locations are replayed when a marble is selected, supporting reflection and revisiting.
  • Figure 2: System pipeline of PlayWrite. (1) Multi-agent observation collects user inputs (speech, grabbing, moving) and interprets them through Environmental, Social, and Narrator agents. (2) The Intent Frame Agent fuses these into higher-level features through temporal filtering, contextual prioritization, and cross-agent fusion. (3) Intent frames are assembled into marbles, which can be replayed, reordered, or exported. (4) Story generation produces a coherent story summary and screenplay from the assembled marbles.
  • Figure 3: Examples of semantically meaningful features captured during play and encoded into Intent Frames. (a) Zone of Interest: a character is placed within a predefined hiding zone, so his actions and speech go unnoticed by other characters. (b) Movement Trail: the movement of a character is tracked to capture directionality and pacing. (c) Character–Prop Interaction: a gun is attached to a character to signal a threat, and other characters react to seeing the gun in hand. (d) Character–Character Interaction: relative positioning and engagement between characters are recorded, here as two characters relaxing together in the pool. (e) Proximity: spatial closeness between the character and a magic carpet indicates plot progression. Together, these features provide structured cues that the system abstracts into Intent Frames for narrative interpretation.
  • Figure 4: Recorded example workflow of Robin Hood extracted from a user's play (P6, open-ended task). Green-labeled boxes show user speech, and black-labeled boxes show AI responses, across two conversation segments: (first row) Mary pleads with Robin Hood for help. The user, controlling Robin, confronts Lord Pemberton. (second row) The story develops as the user switches characters and overrides AI responses to introduce a plot twist. Complete dialogue and synopsis can be found in Appendices E and F.
  • Figure 5: Three distinct scenes used in the user study. (a) Tutorial Scene: A simple environment depicting an interview between two characters with a desk zone and stage zone, including an Elvis impersonator, a hiring manager, and basic props. This scene was used to introduce participants to the system’s core interactions (grabbing, moving, speaking, attaching props). (b) Aladdin Scene: A desert-themed environment with pond and hide zones, featuring Aladdin, Genie, and Jafar alongside interactive props. Participants were encouraged to improvise and stage interactions using familiar characters. They were presented with an end goal to reach. (c) Robin Hood Scene: A castle environment with an open space zone, featuring Robin Hood, Mary, and Lord Pemberton, along with thematic props. Participants were instructed to use the characters and the props for an open-ended narrative exploration. Together, these three scenes provided varying levels of narrative structure, scaffolding participants from practice to open-ended improvisation.
  • ...and 2 more figures
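
The pipeline in the Figure 2 caption (agent observations fused into Intent Frames, which become reorderable marbles) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the fixed 5-second window standing in for "temporal filtering", the integer `priority` field standing in for "contextual prioritization", and the `fuse` and `reorder` helpers are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """One interpreted event from a single agent (hypothetical structure)."""
    agent: str         # "environmental", "social", or "narrator"
    timestamp: float   # seconds since the session started
    text: str          # the agent's interpretation of the user action
    priority: int = 0  # assumed contextual priority; higher is more salient


@dataclass
class IntentFrame:
    """A narrative beat fused from several observations (hypothetical)."""
    start: float
    end: float
    summary: str
    sources: List[str]


def _merge(group: List[Observation]) -> IntentFrame:
    # Higher-priority observations lead the summary; sorted() is stable,
    # so observations with equal priority keep their time order.
    ordered = sorted(group, key=lambda o: -o.priority)
    return IntentFrame(
        start=group[0].timestamp,
        end=group[-1].timestamp,
        summary="; ".join(o.text for o in ordered),
        sources=sorted({o.agent for o in group}),
    )


def fuse(observations: List[Observation], window: float = 5.0) -> List[IntentFrame]:
    """Group observations that fall within `window` seconds of the first
    observation in the group, then merge each group into one frame."""
    frames: List[IntentFrame] = []
    group: List[Observation] = []
    for obs in sorted(observations, key=lambda o: o.timestamp):
        if group and obs.timestamp - group[0].timestamp > window:
            frames.append(_merge(group))
            group = []
        group.append(obs)
    if group:
        frames.append(_merge(group))
    return frames


def reorder(timeline: List[IntentFrame], src: int, dst: int) -> List[IntentFrame]:
    """Return a new timeline with the marble at `src` moved to `dst`."""
    marbles = list(timeline)
    marbles.insert(dst, marbles.pop(src))
    return marbles


observations = [
    Observation("social", 0.0, "Mary pleads with Robin", priority=1),
    Observation("environmental", 1.0, "Robin moves toward Pemberton"),
    Observation("narrator", 10.0, "Robin confronts Pemberton", priority=2),
]
frames = fuse(observations)
reordered = reorder(frames, 1, 0)
```

In this sketch, the first two observations fall in the same window and fuse into one frame, while the third starts a new one. Moving a marble returns a new list rather than mutating the timeline, which would make it straightforward to keep earlier orderings for replay.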