Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint

Junwei Zhou, Xueting Li, Lu Qi, Ming-Hsuan Yang

TL;DR

Layout-Your-3D addresses the challenge of controllable compositional 3D generation from text by leveraging 2D layouts as blueprints. It combines a fast, reconstruction-based coarse 3D initialization with a two-stage disentangled refinement: collision-aware layout refinement to achieve coherent global layouts and instance-wise refinement to enhance geometry and texture, all while supporting interactive per-object editing and downstream tasks like object insertion. The approach yields higher-quality, more plausible compositional 3D scenes with significantly reduced generation time compared to prior methods, as demonstrated on the Compo20 validation set and through extensive ablations. This work offers practical improvements for rapid 3D asset creation and editing, with potential extensions to scene-level generation and more complex interactions.

Abstract

We present Layout-Your-3D, a framework that enables controllable and compositional 3D generation from text prompts. Existing text-to-3D methods often struggle to generate assets with plausible object interactions or require tedious optimization processes. To address these challenges, our approach leverages 2D layouts as a blueprint to facilitate precise and plausible control over 3D generation. Starting with a 2D layout provided by a user or generated from a text description, we first create a coarse 3D scene using a carefully designed initialization process based on efficient reconstruction models. To enforce coherent global 3D layouts and enhance the quality of instance appearances, we propose a collision-aware layout optimization process followed by instance-wise refinement. Experimental results demonstrate that Layout-Your-3D yields more reasonable and visually appealing compositional 3D assets while significantly reducing the time required for each prompt. Additionally, Layout-Your-3D can be easily applied to downstream tasks, such as 3D editing and object insertion. Our project page is available at: https://colezwhy.github.io/layoutyour3d/

Paper Structure

This paper contains 31 sections, 12 equations, 21 figures, 9 tables.

Figures (21)

  • Figure 1: Layout-Your-3D generates high-quality compositional 3D scenes with given 2D layouts (top). Layout-Your-3D further enables editing each instance in the 3D scene with custom text prompts, achieving controllable and precise 3D generation (bottom).
  • Figure 2: Overview of Layout-Your-3D. Given a 2D layout and text prompt, our coarse 3D generation stage (green box, see Sec. \ref{sec:coarse}) generates coarse 3D instances along with roughly reasonable layouts. The disentangled refinement stage (see Sec. \ref{sec:refine}) then refines the 3D layout and enhances individual instance quality by leveraging a collision-aware layout refinement (blue box) followed by an instance-wise refinement (yellow box).
  • Figure 3: Example of the instance segmentation, inpainting and post-process.
  • Figure 4: A simple illustration of how to calculate the collision loss $\mathcal{L}_{col}$.
  • Figure 5: Main results and comparison with other text-to-3D methods on our Compo20 validation set. Layout-Your-3D can generate compositional 3D scenes with higher quality and more reasonable 3D layouts. Note that the first two rows of our results are generated with LLM-grounded 2D layouts, and the last two are generated with user-given 2D layouts.
  • ...and 16 more figures
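
The collision-aware layout refinement mentioned above (and the collision loss $\mathcal{L}_{col}$ of Figure 4) is not defined in this excerpt. As a rough intuition only, the sketch below shows one common way to penalize intersecting objects: summing the pairwise overlap volume of axis-aligned bounding boxes. The box representation and the names `Box`, `overlap_volume`, and `collision_penalty` are assumptions made for illustration and are not taken from the paper, whose actual loss may differ.

```python
from typing import List, Tuple

# Hypothetical box type (an assumption, not the paper's representation):
# (min_x, min_y, min_z, max_x, max_y, max_z).
Box = Tuple[float, float, float, float, float, float]


def overlap_volume(a: Box, b: Box) -> float:
    """Volume of the intersection of two axis-aligned boxes (0.0 if disjoint)."""
    volume = 1.0
    for axis in range(3):
        lo = max(a[axis], b[axis])          # larger of the two minimum corners
        hi = min(a[axis + 3], b[axis + 3])  # smaller of the two maximum corners
        if hi <= lo:
            return 0.0                      # no overlap along this axis
        volume *= hi - lo
    return volume


def collision_penalty(boxes: List[Box]) -> float:
    """Sum of pairwise overlap volumes; zero when no two boxes intersect."""
    total = 0.0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            total += overlap_volume(boxes[i], boxes[j])
    return total
```

A penalty of this kind is zero for a collision-free layout and grows as objects interpenetrate, so minimizing it over object positions pushes overlapping instances apart.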