Position: Interactive Generative Video as Next-Generation Game Engine

Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu

TL;DR

This work argues that Interactive Generative Video (IGV) can underpin next-generation Generative Game Engines (GGE) by unifying content generation, physics-aware world modeling, user interactivity, memory, and causal intelligence. It proposes a six-module IGV framework (Generation, Control, Memory, Dynamics, Intelligence, Gameplay) and a five-level maturity roadmap (L0-L4) to guide development toward physics-compliant, causally coherent, self-evolving virtual worlds. The authors emphasize generalization to open-domain games, scalable learning from abundant video data, and the potential to dramatically reduce development costs while expanding creative freedom. They also discuss alternative viewpoints and ethical considerations, arguing that the transformative benefits justify pursuing IGV-based GGEs.

Abstract

Modern game development faces significant challenges in creativity and cost due to predetermined content in traditional game engines. Recent breakthroughs in video generation models, capable of synthesizing realistic and interactive virtual environments, present an opportunity to revolutionize game creation. In this position paper, we propose Interactive Generative Video (IGV) as the foundation for Generative Game Engines (GGE), enabling unlimited novel content generation in next-generation gaming. GGE leverages IGV's unique strengths in unlimited high-quality content synthesis, physics-aware world modeling, user-controlled interactivity, long-term memory capabilities, and causal reasoning. We present a comprehensive framework detailing GGE's core modules and a hierarchical maturity roadmap (L0-L4) to guide its evolution. Our work charts a new course for game development in the AI era, envisioning a future where AI-powered generative systems fundamentally reshape how games are created and experienced.

Paper Structure

This paper contains 25 sections, 8 figures, 1 table.

Figures (8)

  • Figure 1: Demonstration of GameFactory's ability to generalize action control abilities learned from Minecraft data to open-domain scenarios.
  • Figure 2: Physics-aware generation capabilities of video models. Top: Examples from Cosmos demonstrating physical understanding in diverse scenarios including robotics, autonomous driving, manufacturing, and home environments. Bottom: Human motion examples generated by Kling.
  • Figure 3: GameNGen shows interactive gameplay in generated videos.
  • Figure 4: Proposed framework of the Generative Game Engine (GGE). (a) Architecture of GGE and the interactions between its modules. (b) Technical keywords for each module, with explanations.
  • Figure 5: The Control module manages player control through two aspects: navigation control and interaction control.
  • ...and 3 more figures