Envisioning Future Interactive Web Development: Editing Webpage with Natural Language

Truong Hai Dang, Jingyu Xiao, Yintong Huo

TL;DR

The paper tackles the challenge of editing existing web UI code via natural language by introducing Instruct4Edit, a fully automated data-generation pipeline that uses LLMs to synthesize instruction–HTML edits and verify visual fidelity. It then demonstrates that fine-tuning open-source models with LoRA on this dataset yields meaningful improvements in translating human intent into structurally coherent and visually accurate edits, achieving competitive performance against larger proprietary systems. The work provides a scalable, transparent foundation for NL-based web editing and releases datasets, code, and model checkpoints to enable reproduction and further research. Its results suggest practical potential for iterative design evolution in web applications, with future directions toward broader front-end frameworks and retrieval-augmented reasoning to enhance robustness and applicability.

Abstract

The evolution of web applications relies on iterative code modifications, a process that is traditionally manual and time-consuming. While Large Language Models (LLMs) can generate UI code, their ability to edit existing code from new design requirements (e.g., "center the logo") remains a challenge. This is largely due to the absence of large-scale, high-quality tuning data to align model performance with human expectations. In this paper, we introduce a novel, automated data generation pipeline that uses LLMs to synthesize a high-quality fine-tuning dataset for web editing, named Instruct4Edit. Our approach generates diverse instructions, applies the corresponding code modifications, and performs visual verification to ensure correctness. By fine-tuning models on Instruct4Edit, we demonstrate consistent improvement in translating human intent into precise, structurally coherent, and visually accurate code changes. This work provides a scalable and transparent foundation for natural-language-based web editing, demonstrating that fine-tuning smaller open-source models can achieve performance competitive with proprietary systems. We release all data, code implementations, and model checkpoints for reproduction.
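
The generate, apply, and verify loop named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `EditSample`, `build_dataset`, and the three callables are hypothetical names, and the toy stand-ins replace the LLM calls and the visual check with plain string operations so the sketch runs without any model or browser.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class EditSample:
    """One fine-tuning record: an instruction plus the HTML before and after."""
    instruction: str
    original_html: str
    edited_html: str


def build_dataset(
    pages: Iterable[str],
    propose_instructions: Callable[[str], List[str]],
    apply_edit: Callable[[str, str], str],
    verify_edit: Callable[[str, str, str], bool],
) -> List[EditSample]:
    """Keep only the (instruction, edit) pairs that pass verification.

    All three callables are hypothetical hooks. In a real pipeline they
    would wrap LLM calls and a rendered-page comparison.
    """
    samples: List[EditSample] = []
    for html in pages:
        for instruction in propose_instructions(html):
            edited = apply_edit(html, instruction)
            if verify_edit(html, edited, instruction):
                samples.append(EditSample(instruction, html, edited))
    return samples


# Toy stand-ins (assumptions for illustration only).
def toy_propose(html: str) -> List[str]:
    # A real generator would produce diverse, page-specific instructions.
    return ["center the logo"]


def toy_apply(html: str, instruction: str) -> str:
    # A real editor would be an LLM; this one only knows a single rewrite.
    return html.replace(
        '<img class="logo"',
        '<img class="logo" style="display:block;margin:0 auto"',
    )


def toy_verify(original: str, edited: str, instruction: str) -> bool:
    # A real verifier would compare rendered pages; this one only
    # rejects edits that left the markup unchanged.
    return edited != original


if __name__ == "__main__":
    demo_pages = [
        '<header><img class="logo" src="logo.png"></header>',
        "<p>no logo here</p>",
    ]
    dataset = build_dataset(demo_pages, toy_propose, toy_apply, toy_verify)
    print(len(dataset), "verified sample(s)")
```

Passing the three stages in as callables keeps the filtering logic independent of any particular model or rendering backend, which is one plausible way to structure such a pipeline.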

Paper Structure

This paper contains 21 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Example of a Failed HTML Edit Based on a Design Instruction
  • Figure 2: End-to-end Pipeline to synthesize dataset with LLMs
  • Figure 3: Design edit outputs across model variants