
LayoutFlow: Flow Matching for Layout Generation

Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui, Mayu Otani, Hideki Nakayama

TL;DR

LayoutFlow applies Conditional Flow Matching to layout generation. It formulates a flow between a simple base distribution and the distribution of complex layouts via the ODE $\frac{d}{dt} \phi_t(x) = v_t(\phi_t(x))$, with initial condition $\phi_0(x)=x$. Layouts are represented as sets of elements with continuous geometry and embedded categorical attributes. A Transformer-based predictor learns the vector field $v_t$ along linear trajectories, with an additional $L_1$ regularization on the geometry output. A single conditioning mechanism supports unconditional generation and several conditional generation tasks, and the model achieves state-of-the-art or competitive results with significantly faster inference than diffusion-based models. These results suggest that Flow Matching is a flexible, efficient alternative for layout design, with potential extensions to trajectory selection and content-aware layouts for practical design workflows.
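The training objective along linear trajectories can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the Gaussian base distribution, the 5-element `(x, y, w, h)` layout, and the plain mean-squared-error loss are all assumptions made for illustration, and the $L_1$ geometry regularization and categorical attributes are omitted.

```python
import numpy as np

def linear_path_sample(x0, x1, t):
    """Point on the straight line from x0 (at t=0) to x1 (at t=1)."""
    return (1.0 - t) * x0 + t * x1

def linear_path_target(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return x1 - x0

def cfm_loss(predict_v, x0, x1, t):
    """Mean squared error between predicted and target vector field at x_t."""
    xt = linear_path_sample(x0, x1, t)
    return float(np.mean((predict_v(xt, t) - linear_path_target(x0, x1)) ** 2))

rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 1.0, size=(5, 4))  # hypothetical layout: 5 elements, (x, y, w, h)
x0 = rng.standard_normal((5, 4))         # assumed Gaussian base sample
t = 0.3

# A stand-in "oracle" predictor that already outputs the target velocity.
oracle = lambda xt, t: x1 - x0
# A trivial predictor that always outputs zero, for comparison.
zero = lambda xt, t: np.zeros_like(xt)

oracle_loss = cfm_loss(oracle, x0, x1, t)
zero_loss = cfm_loss(zero, x0, x1, t)
```

In this sketch the oracle attains zero loss, while any other predictor is penalized by its squared distance to the constant velocity `x1 - x0`. A real model would replace `oracle` with a learned network evaluated on the intermediate sample and the time `t`.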

Abstract

Finding a suitable layout represents a crucial task for diverse applications in graphic design. Motivated by simpler and smoother sampling trajectories, we explore the use of Flow Matching as an alternative to current diffusion-based layout generation models. Specifically, we propose LayoutFlow, an efficient flow-based model capable of generating high-quality layouts. Instead of progressively denoising the elements of a noisy layout, our method learns to gradually move, or flow, the elements of an initial sample until it reaches its final prediction. In addition, we employ a conditioning scheme that allows us to handle various generation tasks with varying degrees of conditioning with a single model. Empirically, LayoutFlow performs on par with state-of-the-art models while being significantly faster.

Paper Structure (24 sections, 9 equations, 8 figures, 20 tables, 1 algorithm)

Figures (8)

  • Figure 1: Comparison of different layout generation trajectories. Given a randomly initialized layout at time $t=0$ with fixed element sizes, we visualize different states of the generation process until the final layout at $t=1$. In addition, we overlay the trajectory, which can be interpreted as the movement of the elements over time, on top of the final layout. A circle marks the location of the initial sample and a triangle marks the final location. Flow Matching produces smooth and directed paths, whereas both diffusion models slowly converge to the final prediction under noisy trajectories with a long path length. As a result, flow-based models require fewer evaluation steps than diffusion, leading to faster sampling.
  • Figure 2: Overview of the training procedure of LayoutFlow for the type-conditioned scenario. First, we sample an initial layout from a base distribution and a time $t$. Then, an intermediate sample $\mathbf{g}_t$ is calculated by linearly interpolating between the initial sample and the ground truth layout. Each intermediate element is embedded jointly with the given element condition $\tilde{\mathbf{a}}^k$. Lastly, the Transformer architecture takes all the element embeddings to predict a vector field.
  • Figure 3: Overview of the type-conditioned inference process. The initial sample is autoregressively moved in the predicted direction.
  • Figure 4: Condition masks. Black indicates parts given as conditions.
  • Figure 5: Quality-Speed Comparison
  • ...and 3 more figures
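
The inference process described in the Figure 3 caption, where the initial sample is moved step by step in the predicted direction, can be sketched as a fixed-step Euler integration of the ODE from $t=0$ to $t=1$. This is a minimal sketch under stated assumptions: the solver choice, the step count, the function name `euler_sample`, and the stand-in vector field are hypothetical and not taken from the paper, and conditioning is left out.

```python
import numpy as np

def euler_sample(predict_v, x0, n_steps=10):
    """Integrate dx/dt = v_t(x) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        x = x + dt * predict_v(x, t)  # move the elements in the predicted direction
    return x

rng = np.random.default_rng(0)
start = rng.standard_normal((5, 4))          # assumed base sample: 5 elements, (x, y, w, h)
target = rng.uniform(0.0, 1.0, size=(5, 4))  # hypothetical final layout

# Stand-in vector field: the constant velocity of a straight path to `target`.
straight = lambda x, t: target - start

final = euler_sample(straight, start, n_steps=10)
```

With a perfectly straight trajectory, the number of steps does not change the end point, which gives some intuition for why smoother, more direct paths need fewer function evaluations than noisy ones.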