Graphic Design with Large Multimodal Model
Yutao Cheng, Zhao Zhang, Maoke Yang, Hui Nie, Chunyuan Li, Xinglong Wu, Jie Shao
TL;DR
This paper introduces Hierarchical Layout Generation (HLG) to relax the predefined layer ordering in Graphic Layout Generation (GLG) and enable cohesive designs from unordered design elements. It proposes Graphist, the first end-to-end, large multimodal model-based layout generator that ingests RGB-A inputs and outputs a JSON protocol describing element coordinates, sizes, and hierarchy. To evaluate HLG, the authors develop Inverse Order Pair Ratio (IOPR) and GPT-4V Eval, demonstrating state-of-the-art performance on GLG and HLG tasks across Crello and CGL-V2 datasets, along with real-world and ablation studies. The work highlights Graphist’s potential to democratize graphic design by enabling flexible, automated composition while outlining limitations and avenues for reducing design homogeneity and environmental impact.
Abstract
In the field of graphic design, automating the integration of design elements into a cohesive multi-layered artwork not only boosts productivity but also paves the way for the democratization of graphic design. One existing practice is Graphic Layout Generation (GLG), which aims to lay out sequential design elements. It has been constrained by the necessity for a predefined correct sequence of layers, thus limiting creative potential and increasing user workload. In this paper, we present Hierarchical Layout Generation (HLG) as a more flexible and pragmatic setup, which creates graphic compositions from unordered sets of design elements. To tackle the HLG task, we introduce Graphist, the first layout generation model based on large multimodal models. Graphist efficiently reframes HLG as a sequence generation problem: it takes RGB-A images as input and outputs a JSON draft protocol indicating the coordinates, size, and order of each element. We develop new evaluation metrics for HLG. Graphist outperforms prior art and establishes a strong baseline for this field. Project homepage: https://github.com/graphic-design-ai/graphist
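To make the idea of a JSON draft protocol concrete, the following is a minimal sketch of how such an output might be parsed and turned into a bottom-to-top layer stack. The schema here is an assumption for illustration only: the field names (`canvas`, `layers`, `id`, `x`, `y`, `width`, `height`, `order`) and the helper `parse_draft` are hypothetical and are not taken from the paper or the Graphist repository.

```python
import json

# Hypothetical draft; every field name is an illustrative assumption,
# not the actual schema emitted by Graphist.
draft_text = """
{
  "canvas": {"width": 800, "height": 600},
  "layers": [
    {"id": "title", "x": 120, "y": 40, "width": 560, "height": 90, "order": 2},
    {"id": "background", "x": 0, "y": 0, "width": 800, "height": 600, "order": 0},
    {"id": "photo", "x": 60, "y": 150, "width": 680, "height": 400, "order": 1}
  ]
}
"""

REQUIRED_FIELDS = {"id", "x", "y", "width", "height", "order"}


def parse_draft(text):
    """Parse a draft and return its layers sorted from bottom to top."""
    draft = json.loads(text)
    layers = draft["layers"]
    for layer in layers:
        missing = REQUIRED_FIELDS - layer.keys()
        if missing:
            raise ValueError(f"layer {layer.get('id')!r} is missing {sorted(missing)}")
    # The input elements are unordered; the predicted "order" field
    # (assumed here: 0 = bottom) determines the stacking sequence.
    return sorted(layers, key=lambda layer: layer["order"])


stack = parse_draft(draft_text)
bottom_to_top = [layer["id"] for layer in stack]
print(bottom_to_top)
```

A renderer could then composite the RGB-A elements in this order, placing each one at its predicted coordinates and size. The point of the sketch is only that position, size, and hierarchy all come from a single structured output, so the elements can be supplied in any order.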
