SceneX: Procedural Controllable Large-scale Scene Generation
Mengqi Zhou, Yuxi Wang, Jun Hou, Shougao Zhang, Yiwei Li, Chuanchen Luo, Junran Peng, Zhaoxiang Zhang
TL;DR
SceneX introduces a two-component framework, PCGHub and PCGPlanner, for text-driven, large-scale procedural scene generation executed in Blender. PCGHub provides a broad repository of procedural assets and APIs, while PCGPlanner orchestrates tasks through a systematic LLM prompt template and a hyperparameter generator to produce controllable 3D scenes, including nature and city environments. Extensive experiments demonstrate high-quality, editable outputs and favorable efficiency compared with existing approaches, along with robust performance across different LLMs. The work highlights significant potential to democratize large-scale scene creation for industry pipelines, though it depends on the capabilities of the underlying LLM and the breadth of available PCG resources.
Abstract
Developing comprehensive explicit world models is crucial for understanding and simulating real-world scenarios. Recently, Procedural Controllable Generation (PCG) has gained significant attention in large-scale scene generation by enabling the creation of scalable, high-quality assets. However, PCG faces challenges such as limited modular diversity, high expertise requirements, and difficulty in managing the diverse elements and structures of complex scenes. In this paper, we introduce SceneX, a large-scale scene generation framework that automatically produces high-quality procedural models from designers' textual descriptions. Specifically, the proposed method comprises two components, PCGHub and PCGPlanner. The former encompasses an extensive collection of accessible procedural assets and thousands of hand-crafted API documents that serve as a standard protocol for the PCG controller. The latter generates executable actions for Blender to produce controllable and precise 3D assets guided by the user's instructions. Extensive experiments demonstrate the capability of our method in controllable large-scale scene generation, including nature scenes and unbounded cities, as well as in scene editing tasks such as asset placement and season translation.
