MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation
Yan Ma, Yu Qiao, Pengfei Liu
TL;DR
This paper introduces MoPS, a modular approach that automatically synthesizes diverse, high-quality story premises by decomposing each premise into four sequential modules (theme, background, persona, plot) and linking them through a nested candidate dictionary. A three-phase workflow (inducing module candidates with LLMs, sampling a design path, and synthesizing a premise sentence with self-verification) enables combinatorial creativity while maintaining coherence. MoPS shows higher diversity and quality than the baselines, provides large-scale datasets (7.6k complete designs, 7.6k premises) along with curated subsets, and yields higher-quality extended stories when its premises are fed into contemporary story generation pipelines. The work advances automated story generation by providing reproducible premises that score well on diversity and quality, offering a foundation for cross-modal and large-scale open-ended storytelling systems. It also discusses limitations and future directions, including adding further module types and improving evaluation methods for open-ended narratives.
Abstract
A story premise succinctly defines a story's main idea, foundation, and trajectory. It serves as the initial trigger in automatic story generation. Existing sources of story premises suffer from limited diversity, uneven quality, and high costs that make them difficult to scale. In response, we introduce Modular Story Premise Synthesis (MoPS), which breaks down story premises into modules such as background and persona for automated design and generation. MoPS consists of three phases: (1) Precollect a consistent set of candidates for each module to form a nested dictionary. (2) Extract a key path from the nested dictionary as the premise design. (3) Instruct an LLM to integrate the design into a coherent premise sentence. Thorough evaluations demonstrate that our synthesized premises excel in diversity, fascination, completeness, and originality compared to those induced from large language models and captured from public story datasets. Similarly, the extended novels and scripts generated from our premises also exhibit higher quality. In supplementary materials, we provide the MoPS code suite, along with 7.6k generated premises and 1k extended stories. Code: https://github.com/GAIR-NLP/MoPS.
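The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the MoPS implementation: the candidate entries, the module order (theme, background, persona, plot), the function names `sample_design`, `count_designs`, and `build_prompt`, and the prompt wording are all assumptions made for illustration. In the actual method the candidates are induced with an LLM and the final prompt is sent to an LLM; here the dictionary is hard-coded and the prompt is only constructed.

```python
import random

# Phase 1 (assumed shape): a nested candidate dictionary,
# theme -> background -> persona -> list of plots.
# Every entry below is invented for illustration.
CANDIDATES = {
    "redemption": {
        "a flooded coastal city": {
            "a disgraced engineer": [
                "must rebuild the sea wall she once sabotaged",
                "discovers the flood was planned by her mentor",
            ],
        },
    },
    "first contact": {
        "a remote arctic station": {
            "a skeptical radio operator": [
                "decodes a signal that answers questions before they are asked",
            ],
        },
    },
}

MODULES = ("theme", "background", "persona", "plot")


def sample_design(candidates, rng):
    """Phase 2: walk one key path through the nested dictionary."""
    design = {}
    node = candidates
    for module in MODULES:
        # Inner levels are dicts keyed by candidate; the last level is a list.
        options = sorted(node) if isinstance(node, dict) else list(node)
        choice = rng.choice(options)
        design[module] = choice
        node = node[choice] if isinstance(node, dict) else None
    return design


def count_designs(node):
    """Count the distinct key paths (complete designs) in the dictionary."""
    if isinstance(node, list):
        return len(node)
    return sum(count_designs(child) for child in node.values())


def build_prompt(design):
    """Phase 3: build the instruction an LLM would receive (hypothetical wording)."""
    lines = [f"- {module}: {design[module]}" for module in MODULES]
    return (
        "Combine the following elements into one coherent story premise sentence:\n"
        + "\n".join(lines)
    )


if __name__ == "__main__":
    design = sample_design(CANDIDATES, random.Random(0))
    print(count_designs(CANDIDATES), "possible designs")
    print(build_prompt(design))
```

Because each level is conditioned on the choices above it, a sampled path only combines candidates that were collected together, which is one way to read the abstract's "consistent set of candidates" while still allowing many distinct designs.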
