ID.8: Co-Creating Visual Stories with Generative AI

Victor Nikhil Antony; Chien-Ming Huang

TL;DR

This paper presents ID.8, an open-source, end-to-end visual story authoring system that unifies text, visuals, and audio generation through a human-in-the-loop workflow. By coordinating a Storyline Creator with Leela (an LLM), a Storyboard, and a Scene Editor that leverages Stable Diffusion, AudioGen, and MusicGen, the authors demonstrate how multimodal generative AI can support iterative, co-creative storytelling. Two user studies reveal generally positive usability and creative exploration, but also highlight gaps in immersion, alignment, and perceived collaboration, guiding design improvements. The work contributes not only an operational platform but also design guidelines for future multimodal, co-creative systems and emphasizes the importance of user-friendly prompting, safety, and cohesive AI identity in human-AI collaboration.

Abstract

Storytelling is an integral part of human culture and significantly impacts cognitive and socio-emotional development and connection. Despite the importance of interactive visual storytelling, the process of creating such content requires specialized skills and is labor-intensive. This paper introduces ID.8, an open-source system designed for the co-creation of visual stories with generative AI. We focus on enabling an inclusive storytelling experience by simplifying the content creation process and allowing for customization. Our user evaluation confirms a generally positive user experience in domains such as enjoyment and exploration, while highlighting areas for improvement, particularly in immersiveness, alignment, and partnership between the user and the AI system. Overall, our findings indicate promising possibilities for empowering people to create visual stories with generative AI. This work contributes a novel content authoring system, ID.8, and insights into the challenges and potential of using generative AI for multimedia content creation.

Paper Structure (41 sections, 7 figures, 3 tables)

Introduction
Related Work
Story Authoring Tools
Generative AI and Content Generation
Human-AI Co-Creative Systems
ID.8: An Integrated Authoring System for Visual Story Creation
System Overview
Storyline Creator
Storyboard
Scene Editor
Evaluation
Study Measures
Study 1: Evaluating Usability
Study Procedure and Task
Participants
...and 26 more sections

Figures (7)

Figure 1: ID.8 (1) enables generation of a story in collaboration with ChatGPT, (2) allows the user to manually edit the story, and (3) generates, using ChatGPT, a structured script and pre-populates the storyboard with scenes from the script.
Figure 2: ID.8 Storyboard allows for organization of the story flow by linking scenes and specifying how story viewer inputs should impact the flow of the story. Users access the Scene Editor module by double-clicking a scene node. Users can also preview their story.
Figure 3: (1) The ID.8 Scene Editor enables creation of prompts for text-to-image/audio models in collaboration with ChatGPT; (2) for character generation, ID.8 empowers users to select parts of the generated output to be used in the story; (3) ID.8 provides a simple interface for adding interaction with the viewer.
Figure 4: Results from Study 1: (a) SUS Scores, (b) MICSI Sub-Scale Scores, (c) Exploratory Question Responses.
Figure 5: Scenes from stories generated by participants using ID.8 in Study 1 and Study 2.
...and 2 more figures
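
The Storyboard described in Figure 2, where scenes are linked and viewer inputs decide which scene comes next, can be pictured as a small directed graph. The sketch below is only an illustration under that assumption: the `Scene` class, the `play` function, and the example scene and input names are all hypothetical and are not taken from ID.8's code or data format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Scene:
    """One storyboard node (hypothetical structure, not ID.8's actual schema)."""
    name: str
    # Maps a viewer input (e.g. a choice label) to the name of the next scene.
    links: dict[str, str] = field(default_factory=dict)


def play(storyboard: dict[str, Scene], start: str, inputs: list[str]) -> list[str]:
    """Follow links from `start`, consuming one viewer input per scene."""
    path = [start]
    current = storyboard[start]
    for choice in inputs:
        if choice not in current.links:
            break  # no link for this input: the story stops at this scene
        current = storyboard[current.links[choice]]
        path.append(current.name)
    return path


# A made-up four-scene story with one branching point.
storyboard = {
    "forest": Scene("forest", {"left": "river", "right": "cave"}),
    "river": Scene("river", {"swim": "ending"}),
    "cave": Scene("cave"),
    "ending": Scene("ending"),
}
```

Calling `play(storyboard, "forest", ["left", "swim"])` walks the branch chosen by the viewer and returns the names of the scenes visited in order. Keying links by viewer input keeps the branching logic in the data rather than in the traversal code, which is one plausible way to support the preview feature the caption mentions.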
