Zero-Shot Prompting Approaches for LLM-based Graphical User Interface Generation

Kristian Kolthoff, Felix Kretzer, Lennart Fiebig, Christian Bartelt, Alexander Maedche, Simone Paolo Ponzetto

TL;DR

This work tackles the challenge of generating high-fidelity graphical user interfaces without costly model training by using zero-shot prompting. It introduces Retrieval-Augmented GUI Generation (RAGG), Prompt Decomposition for GUI Generation (PDGG), and Self-Critique for GUI Generation (SCGG), pairing a large-scale GUI repository with LLM reasoning and feedback loops. In a crowd-sourced evaluation with over 3,000 GUI annotations from over 100 crowd-workers, SCGG led to more effective GUI generation than PDGG and RAGG, while RAGG benefits from larger example sets and PDGG offers decomposition advantages. The study suggests that zero-shot prompting can accelerate GUI prototyping and improve realism, and it documents the defects that LLMs introduce into generated prototypes, with practical implications for rapid, user-centered GUI development.

Abstract

Graphical user interface (GUI) prototyping represents an essential activity in the development of interactive systems, which are omnipresent today. GUI prototypes facilitate elicitation of requirements and help to test, evaluate, and validate ideas with users and the development team. However, creating GUI prototypes is a time-consuming process and often requires extensive resources. While existing research for automatic GUI generation focused largely on resource-intensive training and fine-tuning of LLMs, mainly for low-fidelity GUIs, we investigate the potential and effectiveness of Zero-Shot (ZS) prompting for high-fidelity GUI generation. We propose a Retrieval-Augmented GUI Generation (RAGG) approach, integrated with an LLM-based GUI retrieval re-ranking and filtering mechanism based on a large-scale GUI repository. In addition, we adapt Prompt Decomposition (PDGG) and Self-Critique (SCGG) for GUI generation. To evaluate the effectiveness of the proposed ZS prompting approaches for GUI generation, we extensively evaluated the accuracy and subjective satisfaction of the generated GUI prototypes. Our evaluation, which encompasses over 3,000 GUI annotations from over 100 crowd-workers with UI/UX experience, shows that SCGG, in contrast to PDGG and RAGG, can lead to more effective GUI generation, and provides valuable insights into the defects that are produced by the LLMs in the generated GUI prototypes.
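
The self-critique idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `LLM` callable, the prompt wording, the `rounds` parameter, and the choice of HTML as the output format are all assumptions made for illustration. The sketch only shows the general generate, critique, revise loop, with no model training involved.

```python
from typing import Callable

# Hypothetical interface: any function that maps a prompt string to a text reply.
LLM = Callable[[str], str]


def self_critique_generate(requirements: str, llm: LLM, rounds: int = 2) -> str:
    """Generate a GUI prototype, then refine it through critique/revise rounds.

    Assumption: the prototype is represented as an HTML string.
    """
    # Initial zero-shot generation from the natural language requirements.
    gui = llm(
        "Generate an HTML GUI prototype for these requirements:\n"
        f"{requirements}"
    )
    for _ in range(rounds):
        # Ask the model to critique its own output against the requirements.
        critique = llm(
            "Critique this GUI against the requirements. "
            "List missing features and defects.\n"
            f"Requirements:\n{requirements}\n\nGUI:\n{gui}"
        )
        # Ask the model to revise the prototype using that critique.
        gui = llm(
            "Revise the GUI to address the critique. Return only HTML.\n"
            f"Critique:\n{critique}\n\nGUI:\n{gui}"
        )
    return gui
```

Each round costs two additional model calls, so `rounds=2` issues five calls in total. Passing the model as a plain callable keeps the loop independent of any particular LLM provider and makes it easy to test with a stub.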

Paper Structure

This paper contains 43 sections, 1 equation, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Overview of our ZS prompting approaches for generating GUI prototypes from natural language requirements (NLR)
  • Figure 2: Overview of the LLM-based content generation
  • Figure 3: Crowd-workers' GUI ratings, generated for baseline (1-3, blue), PDGG (4-6, red), RAGG (7-12, green), SCGG (13-17, orange). A: Feature Compl., B: Feature Width, C: Feature Impl., D: Inform. Organiz., E: Visual Appeal, F: Errors in GUI, G: Overall Satisfaction, H: Complete App. Hatched bars represent models with GUI content generation, triangles represent means.
  • Figure 4: Generated GUI prototypes for (1) ZS instruction, (2) Prompt Decomposition, (3) RAGG$_{k=7}$ and (4) SCGG$_{k=4}$