Graph Canvas for Controllable 3D Scene Generation

Libin Liu; Shen Chen; Sen Jia; Jingzhe Shi; Zhongyu Jiang; Can Jin; Wu Zongkai; Jenq-Neng Hwang; Lei Li

TL;DR

GraphCanvas3D tackles the rigidity of contemporary 3D scene generation by introducing a graph-based, programmable representation that supports real-time, retraining-free edits and 4D scene evolution. It builds a hierarchical optimization pipeline with edge-, subgraph-, and graph-level stages, driven by Multimodal Large Language Models (MLLMs), to enforce spatial coherence and semantic consistency from concise prompts. Key contributions include a modular graph framework, real-time dynamic modification, and an evaluation based on experiments and user studies that shows improved usability, flexibility, and adaptability; code is available at the project website. The approach targets scalable, configurable 3D/4D scene generation in interactive environments, with applications ranging from VR/AR to intelligent robotic manipulation.

Abstract

Spatial intelligence is foundational to AI systems that interact with the physical world, particularly in 3D scene generation and spatial comprehension. Current methodologies for 3D scene generation often rely heavily on predefined datasets, and struggle to adapt dynamically to changing spatial relationships. In this paper, we introduce GraphCanvas3D, a programmable, extensible, and adaptable framework for controllable 3D scene generation. Leveraging in-context learning, GraphCanvas3D enables dynamic adaptability without the need for retraining, supporting flexible and customizable scene creation. Our framework employs hierarchical, graph-driven scene descriptions, representing spatial elements as graph nodes and establishing coherent relationships among objects in 3D environments. Unlike conventional approaches, which are constrained in adaptability and often require predefined input masks or retraining for modifications, GraphCanvas3D allows for seamless object manipulation and scene adjustments on the fly. Additionally, GraphCanvas3D supports 4D scene generation, incorporating temporal dynamics to model changes over time. Experimental results and user studies demonstrate that GraphCanvas3D enhances usability, flexibility, and adaptability for scene generation. Our code and models are available on the project website: https://github.com/ILGLJ/Graph-Canvas.
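
To make the graph-driven scene description concrete, here is a minimal sketch of a scene graph in which objects are nodes and spatial relations are labeled edges. It is not the authors' implementation: the names `SceneGraph`, `add_object`, `relate`, `remove_object`, and `subgraph`, as well as the relation labels, are hypothetical and chosen only for illustration. The sketch covers the data structure and an edit operation; it does not model the MLLM-driven optimization, rendering, or temporal (4D) dynamics.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy scene graph (hypothetical): objects are nodes, relations are labeled edges."""

    # object id -> free-text description of the object
    nodes: dict[str, str] = field(default_factory=dict)
    # (subject id, object id) -> spatial relation label
    edges: dict[tuple[str, str], str] = field(default_factory=dict)

    def add_object(self, obj_id: str, description: str) -> None:
        self.nodes[obj_id] = description

    def relate(self, subject: str, relation: str, obj: str) -> None:
        # Both endpoints must already exist so the graph stays consistent.
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.edges[(subject, obj)] = relation

    def remove_object(self, obj_id: str) -> None:
        # Drop the node and every edge that touches it.
        del self.nodes[obj_id]
        self.edges = {k: v for k, v in self.edges.items() if obj_id not in k}

    def subgraph(self, obj_id: str) -> dict[tuple[str, str], str]:
        # Edges touching one object: the neighborhood a local edit would re-check.
        return {k: v for k, v in self.edges.items() if obj_id in k}


scene = SceneGraph()
scene.add_object("table", "a wooden dining table")
scene.add_object("vase", "a ceramic vase")
scene.add_object("chair", "a wooden chair")
scene.relate("vase", "on top of", "table")
scene.relate("chair", "next to", "table")

# Edit the scene by editing the graph only: swap the vase for a lamp.
scene.remove_object("vase")
scene.add_object("lamp", "a desk lamp")
scene.relate("lamp", "on top of", "table")
```

In this toy version, an edit touches only the affected node and its incident edges, which is the property that makes a graph representation convenient for on-the-fly scene adjustments.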
