Symbolically Scaffolded Play: Designing Role-Sensitive Prompts for Generative NPC Dialogue

Vanessa Figueiredo, David Elumeze

TL;DR

The paper asks whether tightening prompt constraints improves player experience in generative NPC dialogue. Using a within-subjects usability study of a voice-based detective game (The Interview) and a subsequent JSON+RAG redesign evaluated with a synthetic LLM judge, the authors find that tighter prompts do not reliably improve first-play experience and that scaffolding effects are role-dependent: rigid constraints stabilize the Interviewer but can dampen suspects' improvisational believability. They introduce Symbolically Scaffolded Play, a framework that expresses symbolic structures as fuzzy, numerical boundaries to stabilize coherence where needed while preserving spontaneity elsewhere. Methodologically, the work advocates a hybrid evaluation approach, combining human usability studies with synthetic probes, to reveal which scaffolds meaningfully affect player perception. The findings challenge the assumption that more constrained prompts always improve play and offer practical guidance for designing role-sensitive, modular scaffolds that balance reliability and improvisation in generative play systems.

Abstract

Large Language Models (LLMs) promise to transform interactive games by enabling non-player characters (NPCs) to sustain unscripted dialogue. Yet it remains unclear whether constrained prompts actually improve player experience. We investigate this question through The Interview, a voice-based detective game powered by GPT-4o. A within-subjects usability study ($N=10$) compared high-constraint (HCP) and low-constraint (LCP) prompts, revealing no reliable experiential differences beyond sensitivity to technical breakdowns. Guided by these findings, we redesigned the HCP into a hybrid JSON+RAG scaffold and conducted a synthetic evaluation with an LLM judge, positioned as an early-stage complement to usability testing. Results uncovered a novel pattern in which scaffolding effects were role-dependent: the Interviewer (quest-giver NPC) gained stability, while suspect NPCs lost improvisational believability. These findings overturn the assumption that tighter constraints inherently enhance play. Extending fuzzy-symbolic scaffolding, we introduce Symbolically Scaffolded Play, a framework in which symbolic structures are expressed as fuzzy, numerical boundaries that stabilize coherence where needed while preserving improvisation where surprise sustains engagement.

Paper Structure

This paper contains 44 sections, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Our prototype, The Interview, demonstrates how role-sensitive scaffolding of LLM prompts can balance coherence and improvisation in NPC dialogue, offering a research probe into Symbolically Scaffolded Play.
  • Figure 2: Workflow of the JSON+RAG prompting architecture. Player input is processed through a retrieval pipeline that searches lore and dialogue history, while a character engine selects relevant traits and rules from JSON schemas. Both streams are merged into a structured prompt template, balancing improvisation with symbolic coherence. The workflow illustrates how JSON schemas and retrieval augmentations jointly scaffold NPC behavior, enabling systematic role-sensitive comparisons.
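
The workflow described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the schema fields, trait ranges, lore snippets, and function names (`retrieve`, `select_character_context`, `build_prompt`) are all hypothetical, and simple keyword overlap stands in for whatever retrieval method the paper actually uses. Trait ranges such as `[0.5, 0.9]` are one possible reading of "fuzzy, numerical boundaries".

```python
import json
import re

# Hypothetical character schemas (assumption: not taken from the paper).
# Each trait is a [low, high] range rather than a single fixed value.
CHARACTER_SCHEMAS = json.loads("""
{
  "interviewer": {
    "traits": {"formality": [0.8, 1.0], "improvisation": [0.0, 0.2]},
    "rules": [
      {"topics": ["case", "crime"], "text": "Summarize only facts already established."},
      {"topics": ["suspect", "suspects"], "text": "Never reveal who is guilty."}
    ]
  },
  "suspect": {
    "traits": {"formality": [0.2, 0.6], "improvisation": [0.5, 0.9]},
    "rules": [
      {"topics": ["alibi", "night"], "text": "Keep the alibi consistent with earlier answers."}
    ]
  }
}
""")

# Hypothetical lore snippets standing in for the game's knowledge base.
LORE = [
    "The theft happened at the museum on Friday night.",
    "The night guard reported a broken window in the east wing.",
    "The gallery owner left the city on Saturday morning.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, documents, k=2):
    """Return up to k documents sharing at least one word with the query,
    ranked by word overlap (a stand-in for a real retriever)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked if score > 0][:k]


def select_character_context(role, query):
    """Character engine: return the role's traits plus the rules whose
    topics are mentioned in the player's input."""
    schema = CHARACTER_SCHEMAS[role]
    q = tokenize(query)
    rules = [r["text"] for r in schema["rules"] if q & set(r["topics"])]
    return schema["traits"], rules


def build_prompt(role, player_input, history):
    """Merge the retrieval stream and the character stream into one prompt."""
    traits, rules = select_character_context(role, player_input)
    context = retrieve(player_input, LORE + history)
    lines = [f"ROLE: {role}"]
    lines += [
        f"TRAIT {name}: stay within {lo:.1f}-{hi:.1f}"
        for name, (lo, hi) in sorted(traits.items())
    ]
    lines += [f"RULE: {text}" for text in rules]
    lines += [f"CONTEXT: {doc}" for doc in context]
    lines.append(f"PLAYER: {player_input}")
    return "\n".join(lines)


if __name__ == "__main__":
    history = ["Suspect: I was at home all night."]
    print(build_prompt("suspect", "Where were you on Friday night?", history))
```

In this sketch, role sensitivity lives entirely in the schema: the hypothetical Interviewer gets a narrow improvisation range and more rules, while the suspect gets a wider range and fewer rules, so the same pipeline yields a tighter or looser scaffold depending on the role.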