OpenCOLE: Towards Reproducible Automatic Graphic Design Generation
Naoto Inoue, Kento Masui, Wataru Shimoda, Kota Yamaguchi
TL;DR
OpenCOLE tackles reproducibility in automatic graphic design by delivering an open-source pipeline trained exclusively on public data (Crello) and releasing the pipeline and training results to foster community development. It preserves COLE’s architecture (design-plan generation, image synthesis, typography generation, and rendering) while adapting components to public datasets and open models. GPT4V-based evaluation on the DESIGNERINTENTION benchmark indicates that OpenCOLE performs close to COLE, though state-of-the-art text-to-image systems such as SDXL 1.0 and DALL-E 3 still outperform it. The work demonstrates that automated graphic-design tools can be built openly, while highlighting the need for open, robust evaluation frameworks and further improvements toward higher-quality, editable designs.
Abstract
Automatic generation of graphic designs has recently received considerable attention. However, the state-of-the-art approaches are complex and rely on proprietary datasets, which creates reproducibility barriers. In this paper, we propose an open framework for automatic graphic design called OpenCOLE, where we build a modified version of the pioneering COLE and train our model exclusively on publicly available datasets. Based on GPT4V evaluations, our model shows promising performance comparable to the original COLE. We release the pipeline and training results to encourage open development.
