Word2Minecraft: Generating 3D Game Levels through Large Language Models

Shuo Huang, Muhammad Umair Nasir, Steven James, Julian Togelius

TL;DR

Word2Minecraft presents a pipeline that translates structured stories into playable Minecraft levels, using large language models for narrative generation, 2D map construction, and translation into Minecraft. An adaptive tile-scaling mechanism preserves spatial realism, while sub-map generation enables objective-driven variety and scalable complexity. In comparative evaluations, GPT-4-Turbo delivers higher story coherence, diversity, and functional gameplay, whereas GPT-4o-Mini yields stronger aesthetics, and an evolutionary algorithm (EA) baseline achieves the highest overall enjoyment. The work advances story-driven procedural content generation in 3D environments and releases open-source code to support future research and practical deployment.

Abstract

We present Word2Minecraft, a system that leverages large language models to generate playable game levels in Minecraft based on structured stories. The system transforms narrative elements (such as protagonist goals, antagonist challenges, and environmental settings) into game levels with both spatial and gameplay constraints. We introduce a flexible framework that allows for the customization of story complexity, enabling dynamic level generation. The system employs a scaling algorithm to maintain spatial consistency while adapting key game elements. We evaluate Word2Minecraft using both metric-based and human-based methods. Our results show that GPT-4-Turbo outperforms GPT-4o-Mini in most areas, including story coherence and objective enjoyment, while the latter excels in aesthetic appeal. We also demonstrate the system's ability to generate levels with high map enjoyment, offering a promising step forward in the intersection of story generation and game design. We open-source the code at https://github.com/JMZ-kk/Word2Minecraft/tree/word2mc_v0

Paper Structure

This paper contains 30 sections, 5 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: A flowchart depicting the complete Word2Minecraft pipeline. The LLM is used at every stage of the process.
  • Figure 2: Tile type array representation: 0 represents walkable tiles, 1 represents unwalkable tiles, 2 represents objective-related tiles, 3 represents tiles that need to be scaled, and 4 represents tiles that have already been scaled. This map is updated by the scaling algorithm (Algorithm 1).
  • Figure 3: LLM-generated buildings, each expanded from a tile labelled "Illusory Object Tile."
  • Figure 4: Examples of objective-oriented quests.
  • Figure 5: Reconstructed story-based similarity analysis
  • ...and 1 more figure
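
The tile type encoding described in the Figure 2 caption can be illustrated with a small sketch. This is only an assumption-laden illustration of the five integer codes, not the paper's scaling algorithm: the names `TileType`, `tiles_of_type`, `mark_scaled`, and the example grid are hypothetical and do not come from the Word2Minecraft code base.

```python
from enum import IntEnum


class TileType(IntEnum):
    """Integer codes taken from the Figure 2 caption (names are hypothetical)."""
    WALKABLE = 0    # tiles the player can walk on
    UNWALKABLE = 1  # tiles the player cannot walk on
    OBJECTIVE = 2   # objective-related tiles
    TO_SCALE = 3    # tiles that still need to be scaled
    SCALED = 4      # tiles that have already been scaled


def tiles_of_type(grid, tile_type):
    """Return the (row, col) coordinates of every cell holding tile_type."""
    return [
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value == tile_type
    ]


def mark_scaled(grid, row, col):
    """Return a copy of grid in which one TO_SCALE cell is relabelled SCALED.

    This only updates the bookkeeping code; how a tile is actually expanded
    is left to the scaling algorithm, which is not reproduced here.
    """
    if grid[row][col] != TileType.TO_SCALE:
        raise ValueError("cell is not marked as needing scaling")
    updated = [list(r) for r in grid]
    updated[row][col] = int(TileType.SCALED)
    return updated


# A made-up 3x4 map, purely for illustration.
example_grid = [
    [0, 0, 1, 1],
    [0, 3, 0, 1],
    [2, 0, 0, 3],
]

pending = tiles_of_type(example_grid, TileType.TO_SCALE)
after_one_step = mark_scaled(example_grid, *pending[0])
```

Using an `IntEnum` keeps the grid as plain integers, matching the array form in the caption, while giving each code a readable name.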