LongStory: Coherent, Complete and Length Controlled Long story Generation

Kyeongman Park, Nakyeong Yang, Kyomin Jung

TL;DR

LongStory addresses the difficulty of generating long stories that stay coherent and reach a proper ending. It introduces two novel components: CWC, a weight calibrator for long-term and short-term context, and LSP, long story structural positions encoded as discourse tokens, which together enable paragraph-by-paragraph generation of variable-length stories. The model is trained on three datasets with different average story lengths and evaluated against baselines such as Plotmachine, which it outperforms in coherence, completeness, relevance, and repetitiveness; the evaluation also includes zero-shot tests. These results, along with the introduction of a completeness metric and dataset-driven evaluation, demonstrate the practical potential of structure-aware, memory-augmented generation for robust long-form text generation.

Abstract

A human author can write a story of any length without losing coherence. They also always bring the story to a proper ending, an ability that current language models lack. In this work, we present LongStory for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights for the long-term context Memory and the short-term context Cheating, acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural positions of a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator Plotmachine, in coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model's ability to predict outcomes beyond its training data, and we validate our methodology by comparing its performance with variants of our model.

Paper Structure (25 sections, 9 equations, 1 figure, 4 tables)


Figures (1)

  • Figure 1: Model architecture. LongStory takes as input the keywords of the entire story and the discourse tokens (LSP) representing the order of the target paragraphs. BERT-tiny serves as the long and short-term context weight calibrator (CWC), determining the degree to which long-term and short-term contexts are employed. The CWC takes the discourse tokens and the last generated paragraph as inputs and outputs the optimal $\beta$ and $\gamma$ (defined as $1-\alpha-\beta$) for every paragraph. While $\alpha$ is a hyperparameter applied to the input embedding, $\beta$ is a learnable parameter for the long-term context Memory ($M^t$) and short-term context Cheating ($C^t$).
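
The weight relation in the Figure 1 caption, $\gamma = 1-\alpha-\beta$, can be illustrated with a minimal sketch. Everything beyond that relation is an assumption made for illustration and is not taken from the paper: the function names `context_weights` and `mix_contexts` are hypothetical, the sigmoid squashing of a raw calibrator score into $(0, 1-\alpha)$ is one possible way to keep $\gamma$ non-negative, and the weighted sum of input embedding, Memory, and Cheating is only a plausible reading of how the three weights might be applied.

```python
import math
from typing import List, Tuple


def context_weights(alpha: float, beta_raw: float) -> Tuple[float, float, float]:
    """Return (alpha, beta, gamma) with gamma = 1 - alpha - beta.

    alpha    : fixed hyperparameter (weight on the input embedding).
    beta_raw : unbounded score, standing in for a calibrator output
               (hypothetical; the real CWC is a BERT-tiny model).
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    # Assumption: squash the raw score into (0, 1 - alpha) with a sigmoid,
    # so that beta and gamma are both positive and the weights sum to 1.
    beta = (1.0 - alpha) / (1.0 + math.exp(-beta_raw))
    gamma = 1.0 - alpha - beta  # relation stated in the Figure 1 caption
    return alpha, beta, gamma


def mix_contexts(
    embedding: List[float],
    memory: List[float],
    cheating: List[float],
    weights: Tuple[float, float, float],
) -> List[float]:
    """Assumed combination: alpha * embedding + beta * M^t + gamma * C^t."""
    alpha, beta, gamma = weights
    return [
        alpha * e + beta * m + gamma * c
        for e, m, c in zip(embedding, memory, cheating)
    ]


if __name__ == "__main__":
    w = context_weights(alpha=0.5, beta_raw=0.0)
    print(w)
    print(mix_contexts([1.0, 1.0], [2.0, 2.0], [4.0, 4.0], w))
```

Under these assumptions, a larger raw score shifts weight from the short-term context toward the long-term context while the total weight stays fixed at 1.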