Instruction-Driven Game Engines on Large Language Models

Hongqiu Wu, Yan Wang, Xingyuan Liu, Hai Zhao, Min Zhang

TL;DR

IDGE is a neural game engine built on LLMs that follows instruction scripts to autonomously progress turn-based gameplay by predicting the next in-game state. It formulates gameplay as Next State Prediction (NSP), trained with a curriculum that starts from the poker domain and generalizes to diverse variants. The NSP objective is $\sum_{t=1}^{T}\log p_\theta(s_t|s_{t-1},x_t,z)$, with $s_t$ the in-game state, $x_t$ the player action, and $z$ the game script; each state is conditioned only on the previous state ($k=1$) to keep contexts manageable. Training data come from a poker simulator and are balanced so that rare states are adequately represented. A three-stage pipeline (Core Set warmup, Standard NSP training, and Diverse Segment Rephrasing) improves stability and linguistic generalization. In in-domain experiments, fine-tuned CodeLLaMA models with Segment Rephrasing (SR) achieve high state-level and round-level success, while out-of-domain scripts require few-shot samples or user-guided continued training via DPO to reach satisfactory performance. The work suggests a practical path toward rapid, instruction-driven game design with LLM-based engines and motivates broader exploration of more complex, real-time games.

Abstract

The Instruction-Driven Game Engine (IDGE) project aims to democratize game development by enabling a large language model (LLM) to follow free-form game rules and autonomously generate game-play processes. The IDGE allows users to create games by issuing simple natural language instructions, which significantly lowers the barrier for game development. We approach the learning process for IDGEs as a Next State Prediction task, wherein the model autoregressively predicts in-game states given player actions. It is a challenging task because the computation of in-game states must be precise; otherwise, slight errors could disrupt the game-play. To address this, we train the IDGE in a curriculum manner that progressively increases the model's exposure to complex scenarios. Our initial progress lies in developing an IDGE for Poker, a widely played card game. The engine we have designed not only supports a wide range of poker variants but also allows for high customization of rules through natural language inputs. It also supports rapid prototyping of new games from minimal samples, offering a paradigm for game development that relies on minimal prompt and data engineering. This work lays the groundwork for future advancements in instruction-driven game creation, potentially transforming how games are designed and played.
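
The Next State Prediction loop with a one-state history ($k=1$) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the authors' implementation: the prompt layout, the `build_prompt` and `rollout` helpers, and `toy_predict`, a rule-based stand-in for the fine-tuned LLM. The point is only that each step sees the script $z$, the previous state $s_{t-1}$, and the action $x_t$, and nothing older.

```python
from typing import Callable, List

# Hypothetical interface: any callable that maps a prompt to the next state text.
Predictor = Callable[[str], str]


def build_prompt(script: str, prev_state: str, action: str) -> str:
    """Assemble the k=1 context: script z, previous state s_{t-1}, action x_t."""
    return f"script: {script}\nstate: {prev_state}\naction: {action}\nnext state:"


def rollout(predict: Predictor, script: str, init_state: str,
            actions: List[str]) -> List[str]:
    """Progress the game one action at a time, feeding back only the latest state."""
    states = [init_state]
    for action in actions:
        prompt = build_prompt(script, states[-1], action)
        states.append(predict(prompt))
    return states


def toy_predict(prompt: str) -> str:
    """Stand-in for the LLM (assumption): adds the bet amount to the pot."""
    fields = dict(line.split(": ", 1) for line in prompt.splitlines() if ": " in line)
    pot = int(fields["state"].split("=")[1])
    amount = int(fields["action"].split()[1])
    return f"pot={pot + amount}"
```

With this toy predictor, `rollout(toy_predict, "Each bet adds to the pot.", "pot=0", ["bet 5", "bet 10"])` returns `["pot=0", "pot=5", "pot=15"]`. Because only the latest state is fed back, the prompt length stays constant across turns, but a single wrong prediction propagates to every later state, which is why precise state computation matters.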

Paper Structure (20 sections, 1 equation, 4 figures, 7 tables)

Figures (4)

  • Figure 1: 1: Players grew tired of the game's protagonist models. 2, 3: Developers therefore created a new mode with dual protagonists, but players still did not buy it, and they did not know how to develop games themselves. 4: An irreconcilable divide opened between players and developers. 5, 6: With the advent of the IDGE, which can read players' minds, they can experience their games immediately.
  • Figure 2: Game-style samples for next state prediction. The left side is the input text for the engine from a global view, including all parts that are visible to players as well as those that are not. The right side is the diagram of the game from different views.
  • Figure 3: Round-level success rates on variants of 10 prototype poker games. In our experiments, both GPT3.5 and GPT4 achieve a zero success rate on all of these games.
  • Figure 4: Balancing the data distribution when sampling training data to boost data efficiency.
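
The balancing idea in Figure 4 can be sketched in a few lines. The page does not say how the authors balance the distribution, so the inverse-frequency weighting below is an assumed technique chosen for illustration. The `(category, text)` sample format and the helpers `balancing_weights` and `sample_balanced` are likewise hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical sample format: (state_category, sample_text).
Sample = Tuple[str, str]


def balancing_weights(samples: Sequence[Sample]) -> List[float]:
    """Inverse-frequency weights (assumption): every category gets equal total mass."""
    counts = Counter(category for category, _ in samples)
    n_categories = len(counts)
    return [1.0 / (n_categories * counts[category]) for category, _ in samples]


def sample_balanced(samples: Sequence[Sample], k: int, seed: int = 0) -> List[Sample]:
    """Draw k training samples with replacement so rare categories are not starved."""
    rng = random.Random(seed)
    return rng.choices(list(samples), weights=balancing_weights(samples), k=k)
```

For a pool with three "high card" samples and one "straight flush" sample, each "high card" sample gets weight 1/6 and the single "straight flush" sample gets weight 1/2, so both categories are drawn equally often in expectation.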