From Elements to Design: A Layered Approach for Automatic Graphic Design Composition
Jiawei Lin, Shizhao Sun, Danqing Huang, Ting Liu, Ji Li, Jiang Bian
TL;DR
LaDeCo addresses the challenge of automatic holistic graphic design composition by introducing a layered design principle that organizes input multimodal elements into semantic layers and generates layer-specific attributes with context from previously rendered layers. It combines a layer planning module (using GPT-4o) with a layered design composition process built on Large Multimodal Models, enabling sequential, context-aware design generation. Experiments on the Crello dataset show LaDeCo achieving state-of-the-art performance on holistic design composition and outperforming task-specific baselines on subtasks such as content-aware layout and typography, with ablations validating the importance of layer planning, layering, and data size. The approach supports flexible subtask handling (e.g., partial layer guidance) and practical applications such as resolution adjustment, element filling, and design variation, with potential for end-to-end content creation when integrated with image-generation models.
Abstract
In this work, we investigate automatic design composition from multimodal graphic elements. Although recent studies have developed various generative models for graphic design, they usually face two limitations: (1) they focus only on certain subtasks and fall well short of full design composition; (2) they do not consider the hierarchical information of graphic designs during generation. To tackle these issues, we introduce the layered design principle into Large Multimodal Models (LMMs) and propose a novel approach, called LaDeCo, to accomplish this challenging task. Specifically, LaDeCo first performs layer planning for a given element set, dividing the input elements into different semantic layers according to their contents. Based on the planning results, it then predicts the element attributes that control the design composition layer by layer, including the rendered image of previously generated layers in the context. In this way, LaDeCo decomposes the difficult task into smaller, manageable steps. The experimental results demonstrate the effectiveness of LaDeCo in design composition. Furthermore, we show that LaDeCo enables several applications in graphic design, such as resolution adjustment, element filling, and design variation. It even outperforms specialized models on some design subtasks without any task-specific training.
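The two-stage flow described in the abstract (layer planning, then layer-wise attribute prediction with earlier layers as context) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the layer names in `LAYER_ORDER`, the `Element` and `Canvas` types, and the `toy_classify` and `toy_predict` stand-ins are all hypothetical. In LaDeCo the planner and the predictor are model calls, and the context is a rendered image rather than a list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical layer order, bottom to top. The abstract only says that
# elements are divided into semantic layers; these names are placeholders.
LAYER_ORDER = ["background", "image", "text"]


@dataclass
class Element:
    name: str
    kind: str  # coarse content type, e.g. "image" or "text"


@dataclass
class Canvas:
    # Stand-in for the rendered image of the layers generated so far.
    placed: List[Tuple[str, dict]] = field(default_factory=list)


def plan_layers(elements: List[Element],
                classify: Callable[[Element], str]) -> Dict[str, List[Element]]:
    """Layer planning: assign every input element to one semantic layer."""
    layers: Dict[str, List[Element]] = {label: [] for label in LAYER_ORDER}
    for el in elements:
        layers[classify(el)].append(el)
    return layers


def compose(elements: List[Element],
            classify: Callable[[Element], str],
            predict: Callable[[str, List[Element], Canvas], Dict[str, dict]]) -> Canvas:
    """Predict attributes one layer at a time, passing earlier layers as context."""
    canvas = Canvas()
    layers = plan_layers(elements, classify)
    for label in LAYER_ORDER:
        group = layers[label]
        if not group:
            continue
        # The predictor sees the canvas built from all previous layers.
        attrs = predict(label, group, canvas)
        for el in group:
            canvas.placed.append((el.name, attrs[el.name]))
    return canvas


# Toy stand-ins so the sketch runs without any model.
def toy_classify(el: Element) -> str:
    return el.kind if el.kind in LAYER_ORDER else "image"


def toy_predict(label: str, group: List[Element], canvas: Canvas) -> Dict[str, dict]:
    base = len(canvas.placed)  # uses the context: how much is already placed
    return {el.name: {"layer": label, "z": base + i} for i, el in enumerate(group)}


elements = [Element("headline", "text"),
            Element("bg", "background"),
            Element("photo", "image")]
result = compose(elements, toy_classify, toy_predict)
print([name for name, _ in result.placed])  # ['bg', 'photo', 'headline']
```

Passing `classify` and `predict` in as callables keeps the loop itself independent of any particular model, so the same structure would also cover the case where some layers are supplied by the user and only the remaining ones are predicted.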
