A Character-Centric Creative Story Generation via Imagination
Kyeongman Park, Minbeom Kim, Kyomin Jung
TL;DR
CCI tackles limitations in creative storytelling by introducing Image-Guided Imagination (IG) and Multi-Writer (MW) to craft richer story elements and deeper protagonist personas. IG produces diverse visual representations that feed into textual element descriptions, while MW generates multiple persona candidates and uses a continuation score to pick the most coherent and vivid one to inject into the story. The Specification step tightly integrates IG and MW, and experiments show improved diversity, persona relevance, and overall creativity compared with baselines, including DOC, across human and LLM evaluations. The framework supports interactive multimodal storytelling without requiring extensive LLM fine-tuning, offering a practical pathway for richer, user-aligned narratives in cultural development and entertainment contexts.
Abstract
Creative story generation has long been a goal of NLP research. While existing methodologies have aimed to generate long and coherent stories, they fall significantly short of human capabilities in terms of diversity and character depth. To address this, we introduce a novel story generation framework called CCI (Character-centric Creative story generation via Imagination). CCI features two modules for creative story generation: IG (Image-Guided Imagination) and MW (Multi-Writer model). In the IG module, we utilize a text-to-image model to create visual representations of key story elements, such as characters, backgrounds, and main plots, in a more novel and concrete manner than text-only approaches. The MW module uses these story elements to generate multiple persona-description candidates and selects the best one to insert into the story, thereby enhancing the richness and depth of the narrative. We compared the stories generated by CCI and baseline models through statistical analysis, as well as human and LLM evaluations. The results showed that the IG and MW modules significantly improve various aspects of the stories' creativity. Furthermore, our framework enables interactive multi-modal story generation with users, opening up new possibilities for human-LLM integration in cultural development. Project page: https://www.2024cci.p-e.kr/
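The MW module's generate-then-select step can be illustrated with a minimal sketch. The abstract does not specify how candidates are scored, so everything below is an assumption made for illustration: the function names `toy_continuation_score` and `select_best_candidate` are hypothetical, and the word-overlap score is only a toy stand-in for whatever model-based score the framework actually uses.

```python
from typing import Callable, Sequence


def toy_continuation_score(story: str, candidate: str) -> float:
    """Hypothetical stand-in score (NOT the paper's metric):
    fraction of candidate words that also appear in the story so far."""
    story_words = set(story.lower().split())
    candidate_words = candidate.lower().split()
    if not candidate_words:
        return 0.0
    overlap = sum(word in story_words for word in candidate_words)
    return overlap / len(candidate_words)


def select_best_candidate(
    story: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float] = toy_continuation_score,
) -> str:
    """Return the persona-description candidate with the highest score.
    On ties, the earliest candidate in the sequence is returned."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=lambda candidate: score_fn(story, candidate))


if __name__ == "__main__":
    story = "the knight guards the old castle"
    candidates = ["a cheerful baker", "the knight fears the castle"]
    print(select_best_candidate(story, candidates))
```

In a realistic setting, `score_fn` would presumably be replaced by a language-model-based measure of how well the story continues after the candidate is inserted; the selection logic itself would stay the same.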
