PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation

Yuheng Feng; Wen Zhang; Haodong Duan; Xingxing Zou

Abstract

We present PosterIQ, a design-driven benchmark for poster understanding and generation, annotated across composition structure, typographic hierarchy, and semantic intent. It includes 7,765 image-annotation instances and 822 generation prompts spanning real, professional, and synthetic cases. To bridge visual design cognition and generative modeling, we define tasks for layout parsing, text-image correspondence, typography/readability and font perception, design quality assessment, and controllable, composition-aware generation with metaphor. We evaluate state-of-the-art MLLMs and diffusion-based generators, finding persistent gaps in visual hierarchy, typographic semantics, saliency control, and intention communication; commercial models lead on high-level reasoning but act as insensitive automatic raters, while generators render text well yet struggle with composition-aware synthesis. Extensive analyses show PosterIQ is both a quantitative benchmark and a diagnostic tool for design reasoning, offering reproducible, task-specific metrics. We aim to catalyze models' creativity and integrate human-centred design principles into generative vision-language systems.

Paper Structure (17 sections, 15 equations, 28 figures, 7 tables)

Introduction
Related Work
Multimodal Benchmarks
Poster Design and Generative Models
Benchmark
Understanding Tasks
Generation Tasks
Experiment
Understanding Task
Generation Task
Conclusion
Benchmark Statistics
Task Evaluation
Human Evaluation
Annotator Guideline
...and 2 more sections

Figures (28)

Figure 1: Overview of the benchmark, which includes over a dozen tasks.
Figure 2: Qualitative comparison of four models on three layout-related tasks. For Text Localization and Layout Generation, the predicted bounding boxes are shown in red. For the Empty Space task, the selected patch IDs are highlighted in the image.
Figure 3: Qualitative comparison of four models on five generation tasks.
Figure 4: Qualitative comparison of model outputs over supervision-guided iterations.
Figure 5: Benchmark statistics for understanding tasks (top) and generation tasks (bottom).
...and 23 more figures
