
PSDesigner: Automated Graphic Design with a Human-Like Creative Workflow

Xincheng Shuai, Song Tang, Yutong Huang, Henghui Ding, Dacheng Tao

Abstract

Graphic design is a creative and innovative process that plays a crucial role in applications such as e-commerce and advertising. However, developing an automated design system that can faithfully translate user intentions into editable design files remains an open challenge. Although recent studies have leveraged powerful text-to-image models and MLLMs to assist graphic design, they typically simplify professional workflows, resulting in limited flexibility and intuitiveness. To address these limitations, we propose PSDesigner, an automated graphic design system that emulates the creative workflow of human designers. Building upon multiple specialized components, PSDesigner collects theme-related assets based on user instructions, and autonomously infers and executes tool calls to manipulate design files, such as integrating new assets or refining inferior elements. To endow the system with strong tool-use capabilities, we construct a design dataset, CreativePSD, which contains a large amount of high-quality PSD design files annotated with operation traces across a wide range of design scenarios and artistic styles, enabling models to learn expert design procedures. Extensive experiments demonstrate that PSDesigner outperforms existing methods across diverse graphic design tasks, empowering non-specialists to conveniently create production-quality designs.

Paper Structure

This paper contains 15 sections, 4 equations, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: The figure illustrates the high similarity between the graphic design workflows of human experts (top) and PSDesigner (bottom). Both begin by collecting theme-related assets based on the user instructions. Next, they iteratively integrate these assets, performing a bottom-up traversal of the nested hierarchy, first at the group level and then at the asset level. In particular, each step consists of planning and inserting the current asset, then identifying deficiencies and performing refinements. The above steps are repeated until all assets are integrated into the design file.
  • Figure 2: The typical layer hierarchy in PSD (Adobe Photoshop Document) files, where the layers used to compose the same visual concept (e.g., "Left Panel") are grouped together.
  • Figure 3: The three-stage construction pipeline of the proposed design dataset CreativePSD. We first collect high-quality PSD files from the internet and paid data, while grouping the layers based on their underlying visual concepts. Then, we parse the PSD files and extract essential information, such as raw assets, metadata, and intermediate renders. Finally, we use the extracted data to construct the training data for $\mathcal{X}_\text{gen}$ and $\mathcal{X}_\text{edt}$ modes of GraphicPlanner.
  • Figure 4: Evaluation of model performance on translating user intentions into final designs. Most of the compared methods can only generate non-editable raster images or output a few layers with simple attributes, limiting their professionalism and flexibility. Furthermore, many methods cannot generate accurate text, especially for complex characters, e.g., Chinese.
  • Figure 5: Evaluation of model performance on the graphic design composition task, using the Crello-v5 dataset [yamaguchi2021canvasvae]. Our method achieves coherent arrangements and visually appealing outcomes.
  • ...and 1 more figure
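Figures 1 and 2 describe a bottom-up traversal of the nested PSD layer hierarchy, in which layers composing the same visual concept are grouped, and inner groups are integrated before their enclosing groups. The sketch below illustrates one plausible reading of that traversal order on a toy hierarchy; the `Asset`/`Group` classes and the example layer names are hypothetical stand-ins, not the paper's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Asset:
    """A leaf layer (e.g., an image or text layer) in the hierarchy."""
    name: str

@dataclass
class Group:
    """A layer group composing one visual concept (e.g., "Left Panel")."""
    name: str
    children: List[Union["Group", Asset]] = field(default_factory=list)

def integration_order(node: Group) -> List[str]:
    """Post-order (bottom-up) traversal: recurse into nested groups first,
    then collect the remaining assets, so inner visual concepts are
    composed before the groups that contain them."""
    order: List[str] = []
    for child in node.children:
        if isinstance(child, Group):
            order.extend(integration_order(child))
        else:
            order.append(child.name)
    return order

# Hypothetical hierarchy echoing Figure 2's "Left Panel" grouping.
doc = Group("Root", [
    Group("Left Panel", [Asset("background"), Asset("logo")]),
    Asset("headline"),
])
print(integration_order(doc))  # ['background', 'logo', 'headline']
```

In a real pipeline the same traversal could be driven by a PSD parser (e.g., iterating group layers in a parsed file), with each visited asset triggering the plan–insert–identify–refine step from Figure 1.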