WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation

Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao

TL;DR

WebRPG addresses the challenge of generating coherent web visual presentations directly from HTML by defining rendering parameters (RPs) that standardize CSS properties. The approach uses a latent generation framework where a VAE compresses RPs per element and HTML embeddings (semantic via MarkupLM, hierarchical via XPath, and character-count) guide autoregressive or diffusion decoders to produce the RPs, which are then decoded back to rendering CSS. A new Klarna-derived dataset of 88,418 sub-pages supports offline rendering and robust evaluation with metrics including Fréchet Inception Distance, Element IoU, and a novel Style Consistency Score; experiments show autoregressive WebRPG-AR generally outperforms diffusion-based WebRPG-DM, with GPT-4 providing additional qualitative gains. The work demonstrates the feasibility of automated web design workflows from HTML, highlights important design-knowledge transfer via HTML embeddings, and lays groundwork for future integration with large language models and CSS frameworks to broaden applicability and practicality.
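This summary does not define Element IoU precisely, so the following is only a minimal sketch of one plausible reading: the standard intersection-over-union of two rendered bounding boxes, averaged over elements that appear in both the generated and the reference page. Matching elements by an XPath-like key, the `(left, top, width, height)` box layout, and the names `box_iou` and `mean_element_iou` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, Tuple

# Assumed box layout: (left, top, width, height) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Standard intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def mean_element_iou(generated: Dict[str, Box], reference: Dict[str, Box]) -> float:
    """Average IoU over elements present in both pages (hypothetical matching by key)."""
    shared = generated.keys() & reference.keys()
    if not shared:
        return 0.0
    return sum(box_iou(generated[k], reference[k]) for k in shared) / len(shared)
```

Under these assumptions, a page whose elements all land exactly where the reference places them scores 1.0, and a page whose elements never overlap their references scores 0.0.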

Abstract

In an era of content-creation revolution propelled by advances in generative models, the field of web design remains unexplored despite its critical role in modern digital communication. The web design process is complex and often time-consuming, especially for those with limited expertise. In this paper, we introduce Web Rendering Parameters Generation (WebRPG), a new task that aims to automate the generation of the visual presentation of web pages from their HTML code. WebRPG would contribute to a faster web development workflow. Since no benchmark exists for this task, we develop a new dataset for WebRPG through an automated pipeline. Moreover, we present baseline models that use a VAE to manage the numerous elements and rendering parameters, along with a custom HTML embedding that captures essential semantic and hierarchical information from HTML. Extensive experiments, including quantitative evaluations customized for this task, are conducted to evaluate the quality of the generated results.

Paper Structure

This paper contains 37 sections, 9 equations, 18 figures, and 9 tables.

Figures (18)

  • Figure 1: Overview of the WebRPG task. The input consists of plain HTML code and the output comprises rendering parameters for each element. With browser rendering, plain HTML produces a disorganized visual presentation, while incorporating the generated rendering parameters significantly enhances the visual presentation.
  • Figure 2: Selected sub-page screenshots from our dataset. Note that the displayed regions are cropped due to space limitations.
  • Figure 3: Key components of WebRPG models. In the upper left, a VAE compresses the RPs of each element into latent vectors, shown in blue. In the upper right, the "Semantic" (Sem), "Hierarchical" (Hier), and "Character Count" (CharC) embeddings combine into the HTML embedding, shown in orange. Below, the two generative models are illustrated.
  • Figure 4: Qualitative comparison of WebRPG baselines.
  • Figure 5: Case visualization from the ablation study.
  • ...and 13 more figures
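
The Figure 3 caption names hierarchical (XPath) and character-count signals as inputs to the HTML embedding. As a rough illustration of where such raw signals could come from, the sketch below walks plain HTML with Python's standard `html.parser` and records, for each element, an indexed XPath and the number of characters in its direct text. The class name `ElementFeatures`, the helper `element_features`, the indexed XPath format, and the choice to count only direct (non-descendant) text are assumptions for illustration; the paper's actual preprocessing may differ.

```python
from html.parser import HTMLParser
from typing import Dict, List, Tuple

# Void elements never receive a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementFeatures(HTMLParser):
    """Collects {xpath: direct-text character count} for every element."""

    def __init__(self) -> None:
        super().__init__()
        # Each stack entry: (tag, xpath, per-tag sibling counters for its children).
        self.stack: List[Tuple[str, str, Dict[str, int]]] = []
        self.root_counts: Dict[str, int] = {}
        self.features: Dict[str, int] = {}

    def handle_starttag(self, tag, attrs):
        counts = self.stack[-1][2] if self.stack else self.root_counts
        counts[tag] = counts.get(tag, 0) + 1
        parent = self.stack[-1][1] if self.stack else ""
        xpath = f"{parent}/{tag}[{counts[tag]}]"
        self.features[xpath] = 0
        if tag not in VOID_TAGS:
            self.stack.append((tag, xpath, {}))

    def handle_endtag(self, tag):
        # Pop only when the closing tag matches the innermost open element.
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost open element, ignoring surrounding whitespace.
        if self.stack:
            self.features[self.stack[-1][1]] += len(data.strip())


def element_features(html: str) -> Dict[str, int]:
    parser = ElementFeatures()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.features
```

For example, `element_features("<html><body><h1>Hello</h1></body></html>")` maps `/html[1]/body[1]/h1[1]` to 5. In a full model, these raw values would still have to be turned into learned embeddings and combined with the semantic embedding; that step is not shown here.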