Imperative vs. Declarative Programming Paradigms for Open-Universe Scene Generation

Maxim Gumin, Do Heon Han, Seung Jean Yoo, Aditya Ganeshan, R. Kenny Jones, Rio Aguina-Kang, Stewart Morris, Daniel Ritchie

TL;DR

This work challenges the prevailing declarative paradigm for open-universe 3D scene generation by proposing an imperative, LLM-driven program synthesis approach that places objects sequentially and relative to existing ones. An LLM-free error correction mechanism refines the generated programs through low-dimensional parameter updates, enhancing robustness without additional LLM calls. Through perceptual studies, the imperative method is shown to be preferred over strong declarative baselines, and a new automated evaluation metric aligns closely with human judgments. The paper provides a comprehensive evaluation protocol, analyzes the trade-offs between paradigms, discusses limitations, and outlines future directions including dynamic scenes and LLM-compatible DSL design.

Abstract

Current methods for generating 3D scene layouts from text predominantly follow a declarative paradigm, where a Large Language Model (LLM) specifies high-level constraints that are then resolved by a separate solver. This paper challenges that consensus by introducing a more direct, imperative approach. We task an LLM with generating a step-by-step program that iteratively places each object relative to those already in the scene. This paradigm simplifies the underlying scene specification language, enabling the creation of more complex, varied, and highly structured layouts that are difficult to express declaratively. To improve robustness, we complement our method with a novel, LLM-free error correction mechanism that operates directly on the generated code, iteratively adjusting parameters within the program to resolve collisions and other inconsistencies. In forced-choice perceptual studies, human participants overwhelmingly preferred our imperative layouts, choosing them over those from two state-of-the-art declarative systems 82% and 94% of the time, demonstrating the significant potential of this alternative paradigm. Finally, we present a simple automated evaluation metric for 3D scene layout generation that correlates strongly with human judgment.
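
To make the two ideas in the abstract concrete, here is a toy Python sketch of an imperative layout program and an LLM-free parameter correction pass. Everything in it is an assumption made for illustration: the names `Step`, `run`, `total_overlap`, and `correct`, the 2D axis-aligned boxes, and the greedy coordinate search are hypothetical stand-ins, not the paper's DSL or its actual correction procedure, which this page does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Box:
    x: float  # center x on the floor plane
    y: float  # center y on the floor plane
    w: float  # width along x
    d: float  # depth along y


@dataclass
class Step:
    """One imperative statement: place an object relative to an anchor (hypothetical)."""
    name: str
    w: float
    d: float
    anchor: Optional[str]  # None means the offsets are absolute coordinates
    dx: float              # tunable numeric parameter
    dy: float              # tunable numeric parameter


def run(program):
    """Execute the steps in order; each object may refer to earlier ones."""
    scene = {}
    for s in program:
        ax, ay = (0.0, 0.0) if s.anchor is None else (scene[s.anchor].x, scene[s.anchor].y)
        scene[s.name] = Box(ax + s.dx, ay + s.dy, s.w, s.d)
    return scene


def overlap(a, b):
    """Overlap area of two axis-aligned boxes (0 if they do not intersect)."""
    ox = min(a.x + a.w / 2, b.x + b.w / 2) - max(a.x - a.w / 2, b.x - b.w / 2)
    oy = min(a.y + a.d / 2, b.y + b.d / 2) - max(a.y - a.d / 2, b.y - b.d / 2)
    return max(ox, 0.0) * max(oy, 0.0)


def total_overlap(scene):
    boxes = list(scene.values())
    return sum(overlap(boxes[i], boxes[j])
               for i in range(len(boxes)) for j in range(i + 1, len(boxes)))


def correct(program, step=0.1, max_iters=200):
    """Greedy coordinate search over the program's numeric offsets (assumed stand-in).

    The program structure is left untouched; only dx/dy values are nudged,
    and a nudge is kept only if it strictly reduces the total overlap.
    """
    best = total_overlap(run(program))
    for _ in range(max_iters):
        if best <= 1e-9:
            break
        improved = False
        for s in program:
            for attr in ("dx", "dy"):
                for delta in (step, -step):
                    old = getattr(s, attr)
                    setattr(s, attr, old + delta)
                    cost = total_overlap(run(program))
                    if cost < best - 1e-12:
                        best, improved = cost, True
                    else:
                        setattr(s, attr, old)  # revert an unhelpful nudge
        if not improved:
            break
    return program


# A two-step program: the nightstand is placed relative to the bed,
# but its initial offset makes it intersect the bed.
program = [
    Step("bed", w=2.0, d=2.0, anchor=None, dx=0.0, dy=0.0),
    Step("nightstand", w=0.5, d=0.5, anchor="bed", dx=1.0, dy=0.0),
]
before = total_overlap(run(program))
correct(program)
after = total_overlap(run(program))
```

Because the nightstand is defined relative to the bed, nudging the bed's offsets moves both objects together, so the search resolves the collision by adjusting only the nightstand's own `dx`. This is one way to read the appeal of correcting parameters inside the program rather than moving objects independently, though the real system may handle it quite differently.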

Paper Structure

This paper contains 24 sections, 1 equation, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Comparison of the commonly used declarative paradigm (right) and our proposed imperative paradigm (left) for generating a "Garage" scene layout. The imperative paradigm explicitly specifies geometric relationships between objects, enabling flexible and precise arrangements.
  • Figure 2: Our scene synthesis pipeline. An LLM first converts an input text description into a scene template (scene dimensions and list of objects). Then, a layout generation stage determines the positions and orientations of those objects using either the declarative or imperative paradigm. Finally, an optional object retrieval stage determines 3D meshes for each object in the scene. Only the layout generation stage (shown in blue) varies between methods; all other stages are kept fixed for a fair comparison.
  • Figure 3: Qualitative comparisons between our method, DeclBase, and Holodeck. Our method and Holodeck use gpt-4o, while DeclBase uses claude-3-5-sonnet-20241022. See the supplemental for a comparison between our method and DeclBase only using claude-3-5-sonnet-20241022.
  • Figure 4: More scenes synthesized using our imperative layout generation method with error correction.