WebRenderBench: Enhancing Web Interface Generation through Layout-Style Consistency and Reinforcement Learning

Peichao Lai; Jinhui Zhuang; Kexuan Zhang; Ningchang Xiong; Shengjie Wang; Yanwei Xu; Chong Chen; Yilei Wang; Bin Cui

TL;DR

WebRenderBench targets two weaknesses of existing WebUI-to-Code benchmarks, limited data diversity and unreliable evaluation, by providing a large-scale dataset of real-world webpages and a render-based metric that assesses layout and style fidelity. It introduces ALISA (Automated Layout and Style Inspection Agent), which uses the proposed RDA, GDA, and SDA metrics as reinforcement learning rewards to improve code generation when training on crawled asymmetric webpages. Experiments show state-of-the-art performance across multiple metrics and highlight the importance of layout alignment, with ablations confirming the benefit of combining layout and style signals. The result is a practical benchmark and RL framework for more reliable, objective assessment and improvement of WebUI-to-Code generation on real-world web designs.

Abstract

Automating the conversion of UI images into web code is a critical task for front-end development and rapid prototyping. Advances in multimodal large language models (MLLMs) have made WebUI-to-Code increasingly feasible, yet existing benchmarks remain limited in data diversity and evaluation reliability. To address these issues, we present WebRenderBench, a large-scale benchmark of 45.1k webpages collected from real-world portal sites, offering greater diversity, complexity, and realism than prior benchmarks. We further propose a novel evaluation metric that measures layout and style consistency from the final rendered pages. Unlike vision-based methods that rely on costly LLM reasoning or structure-based comparisons vulnerable to noise and asymmetry, our approach enables more efficient, objective, and reliable UI quality assessment. Finally, we introduce the Automated Layout and Style Inspection Agent (ALISA), which integrates this metric into reinforcement learning as a reward signal to enhance training on crawled asymmetric webpages. Experiments show that ALISA significantly boosts generation performance, achieving state-of-the-art results across multiple metrics.
