StoryDiffusion: How to Support UX Storyboarding With Generative-AI

Zhaohui Liang, Xiaoyu Zhang, Kevin Ma, Zhao Liu, Xipei Ren, Kosa Goucher-Lambert, Can Liu

TL;DR

StoryDiffusion tackles the problem of enabling end-to-end UX storyboard creation by integrating GPT-4-based narrative processing with Stable Diffusion image generation in a single workflow. It introduces a three-step prompting pipeline that segments narratives into frames and generates consistent, editable prompts to produce coherent storyboard frames, with a flexible interface for narrative and style adjustments. A user study with 12 UX design students reveals two creative strategies (user-directed and AI-directed) and shows improved ideation and visualization, along with requirements for narrative clarity, visual continuity, and human-AI communication. The work contributes a practical, full-stack prompting solution and provides design implications for AI-assisted storytelling tools, highlighting the trade-offs between precision and exploratory creativity and outlining directions for future enhancements in accuracy, continuity, and editing capabilities.

Abstract

Storyboarding is an established method for designing user experiences. Generative AI can support this process by helping designers quickly create visual narratives. However, existing tools focus only on accurate text-to-image generation. It remains unclear how to effectively support the entire creative process of storyboarding and how to develop AI-powered tools that fit designers' individual workflows. In this work, we iteratively developed and implemented StoryDiffusion, a system that integrates text-to-text and text-to-image models to support the generation of narratives and images in a single pipeline. In a user study, we observed 12 UX designers using the system for both concept ideation and illustration tasks. Our findings identified AI-directed vs. user-directed creative strategies in both tasks and revealed the importance of supporting the interchange between narrative iteration and image generation. We also found that the design task affected participants' strategies and preferences, providing insights for future development.

Paper Structure (46 sections, 7 figures, 2 tables)


Figures (7)

  • Figure 1: Workflow of the experiment used in the formative study. Users entered their stories into ChatGPT to generate prompts, then copied the prompts into a Stable Diffusion model to obtain images. We supplied the system prompts for ChatGPT, which defined GPT-4's role: segmenting the story into the number of scenes requested by the user and generating a corresponding prompt for the Stable Diffusion model for each scene.
  • Figure 2: Storyboards regenerated by (a) FP02 and (b) FP05. Images in the first row are the storyboards drawn by the participants themselves, and images in the third row are the storyboards regenerated using text-to-image tools. The text between the two image rows shows the sentences used to generate the prompt for each image.
  • Figure 3: Parameters and Co-creation Pipeline. GPT-4 first completes the story description provided by the designer, then outputs an overarching story setting that establishes the style parameters. Based on this setting, the story is divided into a specified number of scenes, and the corresponding parameters for each scene are determined. Before image generation, prompt-level parameters are added to the scenes. Once all parameters are set, the diffusion model transforms the prompts into a storyboard. Designers can oversee the entire generation process and interrupt at any stage to make modifications using natural language.
  • Figure 4: Test of StoryDiffusion on two stories. We tested the system on two UX stories; the model and prompt type for each row are indicated in the figure. Natural language prompts were generated using the method shown in the formative study, while parameterized prompts were generated using the StoryDiffusion system. All Stable Diffusion images were generated with the same model. The images processed through our system exhibited richer details, more prominent themes, and stronger coherence across both models.
  • Figure 5: Figma template of storyboard. In each task, participants first read the task requirements, then used our system for creation, and finally copied the images into Figma and completed the captions.
  • ...and 2 more figures
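
The parameter flow described in the Figure 3 caption (story-level setting, then scenes, then prompt-level parameters, then final prompts) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses GPT-4 and Stable Diffusion for these steps, whereas the sketch below replaces both model calls with a plain sentence splitter and string templating so that it runs offline. Every name in it (`StorySetting`, `Scene`, `split_into_scenes`, `build_prompt`, `storyboard_prompts`) and every field is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class StorySetting:
    """Story-level parameters shared by every frame (hypothetical fields)."""
    style: str
    character: str


@dataclass
class Scene:
    """One storyboard frame: a description plus prompt-level parameters."""
    description: str
    prompt_params: List[str] = field(default_factory=list)


def split_into_scenes(story: str, n_scenes: int) -> List[Scene]:
    """Stand-in for the text-to-text step: group sentences into scenes.

    Assumption: a naive split on '.' replaces the GPT-4 segmentation.
    Returns at most n_scenes scenes, fewer if the story is shorter.
    """
    if n_scenes < 1:
        raise ValueError("n_scenes must be at least 1")
    sentences = [s.strip() for s in story.split(".") if s.strip()]
    n = min(n_scenes, len(sentences))
    scenes = []
    for i in range(n):
        # Integer arithmetic spreads sentences evenly over the n scenes.
        start = i * len(sentences) // n
        end = (i + 1) * len(sentences) // n
        scenes.append(Scene(description=". ".join(sentences[start:end])))
    return scenes


def build_prompt(setting: StorySetting, scene: Scene) -> str:
    """Merge scene text, shared setting, and prompt-level parameters."""
    parts = [scene.description, setting.character, setting.style,
             *scene.prompt_params]
    return ", ".join(parts)


def storyboard_prompts(story: str, setting: StorySetting, n_scenes: int,
                       prompt_params: Sequence[str] = ()) -> List[str]:
    """Run the three stages and return one text-to-image prompt per frame."""
    scenes = split_into_scenes(story, n_scenes)
    for scene in scenes:
        scene.prompt_params.extend(prompt_params)
    return [build_prompt(setting, scene) for scene in scenes]


if __name__ == "__main__":
    setting = StorySetting(style="flat vector illustration",
                           character="a young woman with a red backpack")
    story = ("Anna misses her bus. She opens a ride app. "
             "A car arrives. She reaches work on time.")
    for prompt in storyboard_prompts(story, setting, 2, ["wide shot"]):
        print(prompt)
```

Because the same `StorySetting` is appended to every prompt, each frame carries identical style and character text, which is one simple way to approximate the cross-frame consistency the caption describes. Keeping `Scene` objects editable between stages mirrors the idea that a designer can step in and modify any stage before images are generated.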