Position: Interactive Generative Video as Next-Generation Game Engine
Jiwen Yu, Yiran Qin, Haoxuan Che, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
TL;DR
This work argues that Interactive Generative Video (IGV) can underpin next-generation Generative Game Engines (GGEs) by unifying content generation, physics-aware world modeling, user interactivity, memory, and causal reasoning. It proposes a six-module IGV framework (Generation, Control, Memory, Dynamics, Intelligence, Gameplay) and a five-level maturity roadmap (L0-L4) to guide development toward physics-compliant, causally coherent, self-evolving virtual worlds. The authors emphasize generalization to open-domain games, scalable learning from abundant video data, and the potential to sharply reduce development costs while expanding creative freedom. They also discuss alternative viewpoints and ethical considerations, and argue that the transformative benefits justify pursuing IGV-based GGEs.
Abstract
Modern game development faces significant challenges in creativity and cost because traditional game engines rely on predetermined content. Recent breakthroughs in video generation models, which can synthesize realistic and interactive virtual environments, present an opportunity to revolutionize game creation. In this position paper, we propose Interactive Generative Video (IGV) as the foundation for Generative Game Engines (GGEs), enabling unlimited novel content generation in next-generation gaming. GGEs leverage IGV's unique strengths in high-quality content synthesis, physics-aware world modeling, user-controlled interactivity, long-term memory, and causal reasoning. We present a comprehensive framework detailing the core modules of a GGE and a hierarchical maturity roadmap (L0-L4) to guide its evolution. Our work charts a new course for game development in the AI era, envisioning a future where AI-powered generative systems fundamentally reshape how games are created and experienced.
