MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation

Yan Ma, Yu Qiao, Pengfei Liu

TL;DR

This paper introduces MoPS, a modular approach to automatically synthesize diverse and high-quality story premises by decomposing premises into four sequential modules (theme, background, persona, plot) and linking them through a nested candidate dictionary. A three-phase workflow—inducing module candidates with LLMs, sampling a design path, and synthesizing a premise sentence with self-verification—enables combinatorial creativity while maintaining coherence. MoPS demonstrates superior diversity and quality over baselines, with large-scale datasets (7.6k complete designs, 7.6k premises) and curated subsets, and shows that premises generated by MoPS lead to higher-quality extended stories when integrated into contemporary generation pipelines. The work advances automated story generation by providing reproducible premises with strong diversity metrics and quality guarantees, offering a foundation for cross-modal and large-scale open-ended storytelling systems. It also discusses limitations and future directions, including expanding module kinds and improving evaluation methods for open-ended narratives.

Abstract

A story premise succinctly defines a story's main idea, foundation, and trajectory. It serves as the initial trigger in automatic story generation. Existing sources of story premises are limited by a lack of diversity, uneven quality, and high costs that make them difficult to scale. In response, we introduce Modular Story Premise Synthesis (MoPS) which breaks down story premises into modules like background and persona for automated design and generation. MoPS consists of three phases: (1) Precollect a consistent set of candidates for each module to form a nested dictionary. (2) Extract a key path from the nested dictionary as the premise design. (3) Instruct an LLM to integrate the design into a coherent premise sentence. Thorough evaluations demonstrate that our synthesized premises excel in diversity, fascination, completeness, and originality compared to those induced from large language models and captured from public story datasets. Similarly, the extended novels and scripts generated from our premises also exhibit higher quality. In supplementary materials, we provide the MoPS code suite, along with 7.6k generated premises and 1k extended stories. Code: https://github.com/GAIR-NLP/MoPS.
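
Phases (2) and (3) above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the candidate entries, the helper names `sample_design` and `build_prompt`, and the prompt wording are all hypothetical, and the nesting order (theme, background, persona, plot) is taken from the module order given in the TL;DR. Phase (1), in which an LLM induces the candidates, is replaced here by a hand-written toy dictionary, and no LLM is actually called.

```python
import random

# Hypothetical toy candidates (not taken from the MoPS dataset).
# Nesting order follows the four modules: theme -> background -> persona -> plot.
CANDIDATES = {
    "redemption": {
        "a storm-battered lighthouse": {
            "a disgraced sea captain": [
                "guides a stranded ship to shore",
                "finds the log of a vessel he once abandoned",
            ],
        },
        "a drought-stricken farming town": {
            "a former con artist": [
                "repays the families she once swindled",
            ],
        },
    },
    "first contact": {
        "an orbital research station": {
            "a junior linguist": [
                "decodes a signal that her superiors want buried",
            ],
        },
    },
}

MODULES = ("theme", "background", "persona", "plot")


def sample_design(candidates, rng):
    """Walk one key path through the nested dictionary (phase 2)."""
    design = {}
    node = candidates
    for module in MODULES:
        # sorted() yields the keys of a dict or the items of the final list.
        choice = rng.choice(sorted(node))
        design[module] = choice
        if isinstance(node, dict):
            node = node[choice]  # descend only into candidates consistent with the choice
    return design


def build_prompt(design):
    """Format the sampled design as an instruction for an LLM (phase 3, hypothetical wording)."""
    lines = [f"- {module}: {design[module]}" for module in MODULES]
    header = "Write a one-sentence story premise that integrates these elements:"
    return header + "\n" + "\n".join(lines)


if __name__ == "__main__":
    design = sample_design(CANDIDATES, random.Random(0))
    print(build_prompt(design))
```

Because each level is sampled only from the children of the previous choice, every sampled path stays internally consistent, while the number of distinct designs grows with the product of the branching factors.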

Paper Structure

This paper contains 35 sections, 2 equations, 8 figures, and 25 tables.

Figures (8)

  • Figure 1: Overview of MoPS. We divide the premise into four ordered modules: theme (gray), background (red), persona (green), and plot (blue), with each module further divided into submodules. From the top down, arrows indicate the dependency relationships within and between modules.
  • Figure 2: Case study on premise synthesis demonstrates LLM's ability to extract core information from modules and integrate them into a cohesive final premise, effectively encapsulating the sampled module path.
  • Figure 3: Diversity Metrics. Breadth score, shown top left, measures the polygon area from 2D semantic embedding vectors. Density score, displayed top right, calculates the standard deviation within the polygon from a 2D histogram. Examples (A, B, C) illustrate that reduced-dimension embeddings effectively capture semantic similarity.
  • Figure 4: Breadth score of all methods. The premises synthesized by MoPS surpassed comparative methods in semantic breadth. Note: Chrome or Edge browser may not display this figure properly. Please use a specialized PDF viewer.
  • Figure 5: Density score of all methods. The premises synthesized by MoPS surpassed comparative methods in semantic density.
  • ...and 3 more figures
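
The two diversity metrics described in the Figure 3 caption can be approximated as follows. This is a sketch under stated assumptions, not the paper's code: it assumes the "polygon" is the convex hull of the 2D embedding points, that the density score is the population standard deviation of histogram counts over bins whose centers lie inside that hull, and that the points have already been reduced to 2D by some external method. The function names and the `bins` parameter are hypothetical.

```python
from statistics import pstdev


def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def breadth_score(points):
    """Area of the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    pairs = zip(hull, hull[1:] + hull[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


def density_score(points, bins=4):
    """Std. dev. of 2D-histogram counts over bins whose centers fall inside the hull."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("need at least three non-collinear points")
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    w, h = (max(xs) - x0) / bins, (max(ys) - y0) / bins
    edges = list(zip(hull, hull[1:] + hull[:1]))
    counts = {}
    for i in range(bins):
        for j in range(bins):
            center = (x0 + (i + 0.5) * w, y0 + (j + 0.5) * h)
            if all(cross(a, b, center) >= 0 for a, b in edges):
                counts[(i, j)] = 0
    for x, y in points:
        key = (min(int((x - x0) / w), bins - 1), min(int((y - y0) / h), bins - 1))
        if key in counts:  # points in bins centered outside the hull are ignored
            counts[key] += 1
    return pstdev(list(counts.values())) if counts else 0.0
```

Under these assumptions, a larger hull area means the premises cover more of the embedding space, and a smaller standard deviation means they are spread more evenly across it. How the paper orients or normalizes the density score is not stated in the caption.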