FilmSceneDesigner: Chaining Set Design for Procedural Film Scene Generation
Zhifeng Xie, Keyi Zhang, Yiye Yan, Yuling Guo, Fan Yang, Jiting Zhou, Mengtian Li
TL;DR
Problem: manual film set design is labor-intensive and time-consuming. Approach: FilmSceneDesigner combines an agent-based chaining framework with a four-stage procedural generation pipeline. The framework maps a natural language scene description to structured parameters, and the pipeline uses them for floorplan and structure generation, material assignment, door and window placement, and object layout, integrating assets from SetDepot-Pro. Contributions: the work formalizes two structure types and precise geometric representations, e.g., $S = [R_1, \dots, R_n]$, $E_{ij} = [x_{start}, y_{start}, x_{end}, y_{end}]$ (line) and $E_{ij} = [x_{start}, y_{start}, x_{end}, y_{end}, h_{chord}]$ (arc), $A = \{(r_i, r_j, \text{relation})\}$, and $p(o_r) = p(o_a) + \lambda(d) \cdot \vec{v}(s)$, together with a hook-based data bridge. Dataset: SetDepot-Pro, with 6,862 assets and 733 materials, enables semantic retrieval via Sentence-BERT. Findings: GPT-4V-based evaluations and user studies show superior alignment with cinematic goals across layout, material realism, style, and atmosphere, especially in culturally specific scenes. Significance: the system supports scalable, production-ready previs, construction drawings, and mood boards, improving realism and efficiency in film production workflows.
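The geometric representations listed above can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the class and function names are hypothetical, $h_{chord}$ is assumed to be the arc's height above its chord (the sagitta), $\lambda(d)$ is assumed to map a distance label to a scalar offset, and $\vec{v}(s)$ is assumed to map a side label to a unit direction vector. The numeric values in the lookup tables are placeholders.

```python
import math
from dataclasses import dataclass


@dataclass
class LineEdge:
    """Straight edge E_ij = [x_start, y_start, x_end, y_end]."""
    x_start: float
    y_start: float
    x_end: float
    y_end: float

    def length(self) -> float:
        return math.hypot(self.x_end - self.x_start, self.y_end - self.y_start)


@dataclass
class ArcEdge:
    """Arc edge E_ij = [x_start, y_start, x_end, y_end, h_chord].

    Assumption: h_chord is the sagitta, i.e. the height of the arc
    measured from the midpoint of the chord.
    """
    x_start: float
    y_start: float
    x_end: float
    y_end: float
    h_chord: float

    def length(self) -> float:
        chord = math.hypot(self.x_end - self.x_start, self.y_end - self.y_start)
        h = abs(self.h_chord)
        if h == 0:
            return chord  # degenerate arc: a straight line
        # Circle radius from chord length and sagitta.
        radius = chord ** 2 / (8 * h) + h / 2
        half_angle = math.asin(min(1.0, chord / (2 * radius)))
        if h > radius:  # arc spans more than a semicircle
            half_angle = math.pi - half_angle
        return 2 * radius * half_angle


# Hypothetical lookup tables standing in for lambda(d) and v(s).
DISTANCE_SCALE = {"adjacent": 0.0, "near": 0.5, "far": 2.0}
SIDE_VECTOR = {"front": (0.0, 1.0), "back": (0.0, -1.0),
               "left": (-1.0, 0.0), "right": (1.0, 0.0)}


def place_relative(anchor_pos, distance, side):
    """Compute p(o_r) = p(o_a) + lambda(d) * v(s) in the floor plane."""
    lam = DISTANCE_SCALE[distance]
    vx, vy = SIDE_VECTOR[side]
    return (anchor_pos[0] + lam * vx, anchor_pos[1] + lam * vy)
```

For example, `place_relative((1.0, 2.0), "near", "right")` offsets the related object 0.5 units along the positive x axis from its anchor, under the placeholder values above.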
Abstract
Film set design plays a pivotal role in cinematic storytelling and in shaping the visual atmosphere. However, the traditional process depends on expert-driven manual modeling, which is labor-intensive and time-consuming. To address this issue, we introduce FilmSceneDesigner, an automated scene generation system that emulates the professional film set design workflow. Given a natural language description that includes scene type, historical period, and style, we design an agent-based chaining framework to generate structured parameters aligned with this workflow, guided by prompt strategies that ensure parameter accuracy and coherence. We further propose a procedural generation pipeline that executes a series of dedicated functions on the structured parameters for floorplan and structure generation, material assignment, door and window placement, and object retrieval and layout, ultimately constructing a complete film scene from scratch. Moreover, to enhance cinematic realism and asset diversity, we construct SetDepot-Pro, a curated dataset of 6,862 film-specific 3D assets and 733 materials. Experimental results and human evaluations demonstrate that our system produces structurally sound scenes with strong cinematic fidelity, supporting downstream tasks such as virtual previs, construction drawing, and mood board creation.
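The staged execution described in the abstract can be sketched as a chain of functions, each consuming its own block of structured parameters and building on the results of earlier stages. This is a toy illustration only: the stage names, parameter keys, and the dictionary-based scene are all assumptions made here, and the paper's actual functions and data formats may differ.

```python
def build_floorplan(scene, params):
    # Stage 1: floorplan and structure generation.
    scene["rooms"] = list(params["rooms"])


def assign_materials(scene, params):
    # Stage 2: one material per room, with a fallback for unspecified rooms.
    scene["materials"] = {room: params.get(room, "default")
                          for room in scene["rooms"]}


def place_openings(scene, params):
    # Stage 3: keep only doors that connect rooms created in stage 1.
    rooms = set(scene["rooms"])
    scene["doors"] = [d for d in params["doors"]
                      if d[0] in rooms and d[1] in rooms]


def layout_objects(scene, params):
    # Stage 4: object retrieval and layout, reduced here to a per-room list.
    scene["objects"] = {room: list(params.get(room, []))
                        for room in scene["rooms"]}


# Ordered chain of (parameter key, stage function); names are hypothetical.
PIPELINE = [
    ("floorplan", build_floorplan),
    ("materials", assign_materials),
    ("openings", place_openings),
    ("objects", layout_objects),
]


def run_pipeline(structured_params):
    """Run the four stages in order on one shared scene dictionary."""
    scene = {}
    for key, stage in PIPELINE:
        if key not in structured_params:
            raise KeyError(f"missing parameters for stage '{key}'")
        stage(scene, structured_params[key])
    return scene


example_params = {
    "floorplan": {"rooms": ["hall", "study"]},
    "materials": {"hall": "oak"},
    "openings": {"doors": [("hall", "study"), ("hall", "attic")]},
    "objects": {"study": ["desk", "lamp"]},
}
example_scene = run_pipeline(example_params)
```

The fixed stage order matters in this sketch because later stages read what earlier ones wrote: the door to the nonexistent `"attic"` is dropped because stage 3 validates against the rooms produced in stage 1.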
