Word2World: Generating Stories and Worlds through Large Language Models

Muhammad U. Nasir, Steven James, Julian Togelius

TL;DR

Word2World presents a zero-shot, LLM-driven pipeline that turns stories into narrative elements and playable 2D worlds. It iteratively extracts characters, tiles, and goals from the story, then constructs the world in two steps with feedback loops. The pipeline combines DistilBERT-based tile retrieval from curated datasets, A*-style playability checks, and an LLM agent that traverses the objectives to refine the design, achieving high coherence and around 90% playability across the tested models. The study provides extensive ablations and cross-model evaluations to demonstrate the necessity of each pipeline component, and highlights the system's potential as a narrative-to-level tool and as a generator of reinforcement learning environments. The approach advances procedural content generation by enabling end-to-end, narrative-consistent world creation without task-specific fine-tuning, with practical implications for game design and for AI research on open-ended environments.
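The playability check mentioned above can be illustrated with a minimal sketch. It assumes a world stored as a list of strings, a set of walkable tile characters, and an ordered list of objective positions; the function names (`astar_path_length`, `is_playable`), the tile characters, and the Manhattan-distance heuristic are assumptions made for illustration and are not taken from the Word2World code.

```python
import heapq

def astar_path_length(grid, walkable, start, goal):
    """Length of a shortest 4-connected path from start to goal, or None.

    Assumption: the goal cell may always be entered, even if its tile
    character is not listed in `walkable`.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            return cost
        if cost > best_cost.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] not in walkable and nxt != goal:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None

def is_playable(grid, walkable, start, objectives):
    """True if every objective can be reached in order, starting from start."""
    position = start
    for objective in objectives:
        if astar_path_length(grid, walkable, position, objective) is None:
            return False
        position = objective
    return True

# Hypothetical 5x5 world: '#' wall, '.' floor, '@' player, 'K' key, 'D' door.
world = [
    "#####",
    "#@.K#",
    "#.#.#",
    "#..D#",
    "#####",
]
walkable_tiles = {".", "@"}
playable = is_playable(world, walkable_tiles, (1, 1), [(1, 3), (3, 3)])
```

Here the objectives are visited in sequence, so a world counts as playable only if each objective is reachable from the previous one. How the actual pipeline defines walkable tiles and orders objectives may differ from this sketch.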

Abstract

Large Language Models (LLMs) have proven their worth across a diverse spectrum of disciplines. LLMs have shown great potential in Procedural Content Generation (PCG) as well, but directly generating a level through a pre-trained LLM is still challenging. This work introduces Word2World, a system that enables LLMs to procedurally design playable games through stories, without any task-specific fine-tuning. Word2World leverages the abilities of LLMs to create diverse content and extract information. Combining these abilities, LLMs can create a story for the game, design narrative, and place tiles in appropriate places to create coherent worlds and playable games. We test Word2World with different LLMs and perform a thorough ablation study to validate each step. We open-source the code at https://github.com/umair-nasir14/Word2World.

Paper Structure

This paper contains 14 sections, 9 figures, 1 table, and 1 algorithm.

Figures (9)

  • Figure 1: A flowchart of the whole pipeline for Word2World. An LLM is required for each step. Once all the required information is extracted, we move to a feedback loop called rounds. This loop makes sure the end result is a playable game that is coherent with the story.
  • Figure 2: An illustration of tiles from Oryx Design Lab.
  • Figure 3: Illustration of how the LLM agent works. The agent receives the world, its own position, the position of the objective, the previous action sequence, and the reward for the previous objective. This loop continues until the agent has generated action sequences for all objectives.
  • Figure 4: An example of the complete pipeline. The story is provided at each step of information extraction. All previously extracted information is also provided when a new piece of information is extracted; for example, to extract walkable tiles, we pass the story, tile mapping, goals, and character information.
  • Figure 5: A comparison of ablation studies with respect to (a) Coherence, (b) Playability, (c) Accuracy of character tiles, and (d) Accuracy of important tiles. From left to right in each panel, the bars represent Word2World, one-round-generation, one-step-generation, direct-generation, no-goals-extracted, and no-important-tiles-extracted.
  • ...and 4 more figures