PSD2Code: Automated Front-End Code Generation from Design Files via Multimodal Large Language Models

Yongxi Chen, Lei Chen

TL;DR

PSD2Code introduces a PSD-first Parse–Align–Generate pipeline that directly extracts hierarchical structures, coordinates, and asset references from PSD design files to constrain multimodal code generation. By integrating structured PSD metadata with asset alignment and a three-fragment prompt, the method yields production-ready React+SCSS code with strong visual fidelity and correct resource usage, validated across multiple LLM backbones. Extensive experiments on a real-world PSD dataset show significant improvements in code similarity, visual similarity, and executability over state-of-the-art baselines, with robust model-agnostic performance and insightful ablations underscoring the value of parsing and constraint-based generation. The work demonstrates a practical, scalable path toward design-driven automated frontend development and emphasizes model-agnostic, asset-aware generation for industrial deployment.

Abstract

Design-to-code generation has emerged as a promising approach to bridge the gap between design prototypes and deployable frontend code. However, existing methods often suffer from structural inconsistencies, asset misalignment, and limited production readiness. This paper presents PSD2Code, a novel multimodal approach that leverages PSD file parsing and asset alignment to generate production-ready React+SCSS code. Our method introduces a Parse-Align-Generate pipeline that extracts hierarchical structures, layer properties, and metadata from PSD files, providing large language models with precise spatial relationships and semantic groupings for frontend code generation. The system employs a constraint-based alignment strategy that ensures consistency between generated elements and design resources, while structured prompt construction enhances controllability and code quality. Comprehensive evaluation demonstrates significant improvements over existing methods across multiple metrics, including code similarity, visual fidelity, and production readiness. The method also exhibits strong model independence across different large language models. These results validate the effectiveness of integrating structured design information with multimodal large language models for industrial-grade code generation and mark an important step toward design-driven automated frontend development.
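
To make the three stages concrete, the following is a minimal sketch of a Parse-Align-Generate flow, not the authors' implementation. It assumes a simplified in-memory layer tree instead of a real PSD reader, and every name in it (`Layer`, `parse`, `align`, `build_prompt`) is hypothetical. The summary above does not say what the three prompt fragments contain, so the split into structure, assets, and rules is also an assumption, and the call to the language model is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Layer:
    """Hypothetical, simplified stand-in for one parsed PSD layer."""
    name: str
    kind: str                      # assumed kinds: "group", "image", "text"
    x: int = 0                     # offset relative to the parent layer
    y: int = 0
    width: int = 0
    height: int = 0
    asset: Optional[str] = None    # exported asset file name, if any
    children: List["Layer"] = field(default_factory=list)


def parse(layer, ox=0, oy=0, depth=0):
    """Parse: flatten the tree into records with absolute coordinates."""
    ax, ay = ox + layer.x, oy + layer.y
    records = [{"name": layer.name, "kind": layer.kind, "depth": depth,
                "box": (ax, ay, layer.width, layer.height),
                "asset": layer.asset}]
    for child in layer.children:
        records.extend(parse(child, ax, ay, depth + 1))
    return records


def align(records, available_assets):
    """Align: drop asset references that are not in the exported asset set."""
    aligned, missing = [], []
    for rec in records:
        if rec["asset"] is not None and rec["asset"] not in available_assets:
            missing.append(rec["asset"])
            rec = {**rec, "asset": None}
        aligned.append(rec)
    return aligned, missing


def build_prompt(records):
    """Generate (prompt only): join three assumed fragments into one prompt."""
    structure = "\n".join(
        f"{'  ' * r['depth']}{r['name']} [{r['kind']}] box={r['box']}"
        for r in records
    )
    assets = "\n".join(sorted({r["asset"] for r in records if r["asset"]}))
    rules = "Emit one React component and one SCSS file. Use only the listed assets."
    return f"## Structure\n{structure}\n\n## Assets\n{assets}\n\n## Rules\n{rules}"


# Usage: a tiny made-up design with one asset that was never exported.
design = Layer("page", "group", 0, 0, 375, 812, children=[
    Layer("header", "group", 0, 44, 375, 56, children=[
        Layer("logo", "image", 16, 8, 40, 40, asset="logo.png"),
        Layer("title", "text", 72, 16, 200, 24),
    ]),
    Layer("banner", "image", 0, 100, 375, 180, asset="banner.png"),
])
records, missing = align(parse(design), {"logo.png"})
prompt = build_prompt(records)
```

In this sketch the alignment step acts as the constraint: the prompt only ever names assets that exist, so a model following the rules fragment has no unexported file to reference. Unresolved references are returned in `missing` rather than silently dropped, so a caller could report them.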

Paper Structure

This paper contains 32 sections, 3 figures, and 7 tables.

Figures (3)

  • Figure 1: Limitations of large-model-based UI-to-code generation in industrial settings.
  • Figure 2: System overview of the proposed Parse-Align-Generate framework.
  • Figure 3: From left to right: original PSD design, CodeFun output, Screenshot-to-code output, and our method's result. Our approach demonstrates superior accuracy in layout reconstruction and style fidelity compared to baseline methods.