Relation-Aware Diffusion Model for Controllable Poster Layout Generation

Fengheng Li; An Liu; Wei Feng; Honghe Zhu; Yaoyu Li; Zheng Zhang; Jingjing Lv; Xin Zhu; Junjie Shen; Zhangang Lin; Jingping Shao

TL;DR

This work tackles poster layout generation with a diffusion-based framework that explicitly models the interplay between visual content, textual information, and geometric relations. It introduces two modules: the visual-textual relation-aware module (VTRAM), which aligns text with visual regions, and the geometry relation-aware module (GRAM), which encodes relative spatial relationships among elements. Together they enable controllable, content-aware layouts. A new dataset with text annotations, CGL-Dataset V2, supports training and evaluation, and experiments show improvements over both content-aware and content-agnostic baselines in user studies and composition-quality metrics. The approach targets automatic poster design in which users can guide the layout through constraints.

Abstract

Poster layout is a crucial aspect of poster design. Prior methods primarily focus on the correlation between visual content and graphic elements. However, a pleasant layout should also consider the relationship between visual and textual contents and the relationship between elements. In this study, we introduce a relation-aware diffusion model for poster layout generation that incorporates these two relationships in the generation process. Firstly, we devise a visual-textual relation-aware module that aligns the visual and textual representations across modalities, thereby enhancing the layout's efficacy in conveying textual information. Subsequently, we propose a geometry relation-aware module that learns the geometry relationship between elements by comprehensively considering contextual information. Additionally, the proposed method can generate diverse layouts based on user constraints. To advance research in this field, we have constructed a poster layout dataset named CGL-Dataset V2. Our proposed method outperforms state-of-the-art methods on CGL-Dataset V2. The data and code will be available at https://github.com/liuan0803/RADM.

Paper Structure (23 sections, 12 equations, 11 figures, 4 tables)

Introduction
Related Work
Layout Generation
Diffusion Models
CGL-Dataset V2
Method
Poster Layout Generation with Diffusion Model
Diffusion Process
Denoise Process
Feature Extractor
Image Encoder
Text Encoder
Visual-Textual Relation-Aware Module
Geometry Relation-Aware Module
Layout Decoder
...and 8 more sections

Figures (11)

Figure 1: Visual examples of poster layouts produced by CGL-GAN (Zhou et al., 2022) and by our method.
Figure 2: (a) Poster layout annotation. Different colors represent different element types, the text annotation results are in the gray box, and the English translation is in brackets; (b) Clean image; (c) Input for inference stage.
Figure 3: The overview of our method, which contains four parts: feature extractor, VTRAM, GRAM and layout decoder.
Figure 4: Inspired by the diffusion denoising process, we formulate poster layout generation as gradually refining the position and size of boxes, shown from left to right, from step $T$ to step $i$.
Figure 5: Overview of the VTRAM. It takes as input text features, RoI features, and the corresponding coordinates. The coordinate information is first embedded into the RoI features to obtain $V_{ip}$. Next, scaled dot-product attention (Vaswani et al., 2017) is computed using the visual position feature $V_{ip}$ as the query and the text features $L$ as the key and value.
...and 6 more figures
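
The Figure 5 caption describes a cross-attention step: position-augmented RoI features $V_{ip}$ act as queries, and text features $L$ act as keys and values. A minimal sketch of that step in plain Python follows. It is an illustration under stated assumptions, not the authors' implementation: the coordinate embedding used here (adding box coordinates onto the feature vector) is a toy stand-in, the function names `embed_coordinates` and `cross_attention` are hypothetical, and the learned projections that a real attention layer would have are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def embed_coordinates(roi_features, boxes):
    """Toy position embedding (an assumption, not the paper's scheme):
    add the box coordinates (x, y, w, h), cycled over the feature
    dimension, to each RoI feature to obtain V_ip."""
    v_ip = []
    for feat, box in zip(roi_features, boxes):
        v_ip.append([f + box[j % 4] for j, f in enumerate(feat)])
    return v_ip


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output channel at a time.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values))
             for c in range(len(values[0]))]
        )
    return outputs, weights


# Two RoIs with 4-dimensional features and normalized (x, y, w, h) boxes.
roi_features = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, 0.2]]
boxes = [[0.1, 0.1, 0.5, 0.2], [0.2, 0.6, 0.6, 0.3]]

# Three text tokens with 4-dimensional features (the L in the caption).
text_features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]

v_ip = embed_coordinates(roi_features, boxes)
fused, attn = cross_attention(v_ip, text_features, text_features)
```

Each row of `attn` is a distribution over the text tokens for one RoI, and each row of `fused` is the text-informed feature for that RoI. Using the visual features as queries means the output keeps one vector per visual region, which matches the caption's description of aligning text to regions rather than regions to text.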
