A Character-Centric Creative Story Generation via Imagination

Kyeongman Park, Minbeom Kim, Kyomin Jung

TL;DR

CCI tackles limitations in creative storytelling by introducing Image-Guided Imagination (IG) and Multi-Writer (MW) to craft richer story elements and deeper protagonist personas. IG produces diverse visual representations that feed into textual element descriptions, while MW injects multiple persona candidates and uses a continuation-score to pick the most coherent, vivid injection. The Specification step tightly integrates IG and MW, and experiments show improved diversity, persona relevance, and overall creativity compared with baselines, including DOC, across human and LLM evaluations. The framework supports interactive multimodal storytelling without requiring extensive LLM fine-tuning, offering a practical pathway for richer, user-aligned narratives in cultural development and entertainment contexts.

Abstract

Creative story generation has long been a goal of NLP research. While existing methodologies have aimed to generate long and coherent stories, they fall significantly short of human capabilities in terms of diversity and character depth. To address this, we introduce a novel story generation framework called CCI (Character-centric Creative story generation via Imagination). CCI features two modules for creative story generation: IG (Image-Guided Imagination) and MW (Multi-Writer model). In the IG module, we utilize a text-to-image model to create visual representations of key story elements, such as characters, backgrounds, and main plots, in a more novel and concrete manner than text-only approaches. The MW module uses these story elements to generate multiple persona-description candidates and selects the best one to insert into the story, thereby enhancing the richness and depth of the narrative. We compared the stories generated by CCI and baseline models through statistical analysis, as well as human and LLM evaluations. The results showed that the IG and MW modules significantly improve various aspects of the stories' creativity. Furthermore, our framework enables interactive multi-modal story generation with users, opening up new possibilities for human-LLM integration in cultural development. Project page: https://www.2024cci.p-e.kr/

Paper Structure (51 sections, 3 figures, 32 tables)

Figures (3)

  • Figure 1: Comparison between DOC and Our CCI-Story Approach. DOC generates stories on similar topics in a monotonous manner. In contrast, our work leverages images to create stories that are not only diverse and creative in their themes but also richer in content, centered around the persona of the main character.
  • Figure 2: Our two main modules, IG and MW. In the IG module, we generate story elements using DALL-E 3 and then create the protagonist's persona through an iterative question-and-answer process. In the next step, the MW module generates multiple candidate descriptions for each ending of the initial paragraph, ensuring they reflect the protagonist's specific persona attributes. During this process, MW filters out candidates that are either too similar to previous sentences or deviate too far from the persona information (filtering examples are in the Behavioral Habit box). Finally, MW ranks the filtered candidates by continuation score (CS, referred to as C-Score in the figure) and selects the best one to continue the initial paragraph.
  • Figure 3: Example of Multimodal Interactive Story. We bold and underline the sentences that derive from the Protagonist, Background, and Climax images.
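
The filter-then-rank step described in the Figure 2 caption can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names (jaccard, select_candidate), the word-overlap similarity, the two thresholds, and the toy scores standing in for the continuation score. The paper's actual similarity measures and scoring model are not specified in this summary.

```python
from typing import Callable, List, Optional


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard overlap; a stand-in for whatever similarity the authors use."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def select_candidate(
    candidates: List[str],
    previous_sentences: List[str],
    persona: str,
    score_fn: Callable[[str], float],
    max_prev_sim: float = 0.6,     # hypothetical threshold: "too similar to previous sentences"
    min_persona_sim: float = 0.1,  # hypothetical threshold: "deviates too far from the persona"
) -> Optional[str]:
    """Filter candidates, then return the highest-scoring survivor (or None)."""
    kept = []
    for cand in candidates:
        too_similar = any(jaccard(cand, prev) > max_prev_sim for prev in previous_sentences)
        off_persona = jaccard(cand, persona) < min_persona_sim
        if not too_similar and not off_persona:
            kept.append(cand)
    if not kept:
        return None
    # Rank the filtered candidates by the (assumed) continuation-score function.
    return max(kept, key=score_fn)


# Toy usage with made-up text and made-up scores.
previous = ["Mira walked into the quiet library."]
persona = "Mira taps her pen whenever she is nervous"
candidates = [
    "Mira walked into the quiet library.",       # repeats an earlier sentence
    "The rain fell over distant hills.",         # unrelated to the persona
    "Mira taps her pen against the desk.",
    "She is nervous, and her pen taps faster.",
]
toy_scores = {candidates[2]: 0.4, candidates[3]: 0.9}  # placeholder continuation scores
best = select_candidate(candidates, previous, persona, lambda c: toy_scores[c])
print(best)
```

In this toy run the first candidate is dropped for repeating an earlier sentence, the second for sharing no words with the persona, and the remaining two are ranked by the placeholder scores. A real system would replace both the similarity function and the scorer with model-based components.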