ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models

Jonas Belouadi, Steffen Eger

TL;DR

This work pre-trains ByGPT5, a new token-free decoder-only language model, and fine-tunes it on a large custom corpus of English and German quatrains annotated with styles such as rhyme, meter, and alliteration. The authors show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2 and ChatGPT, while also being more parameter efficient and performing favorably compared to humans.

Abstract

State-of-the-art poetry generation systems are often complex. They either consist of task-specific model pipelines, incorporate prior knowledge in the form of manually created constraints, or both. In contrast, end-to-end models would not suffer from the overhead of having to model prior knowledge and could learn the nuances of poetry from data alone, reducing the degree of human supervision required. In this work, we investigate end-to-end poetry generation conditioned on styles such as rhyme, meter, and alliteration. We identify and address lack of training data and mismatching tokenization algorithms as possible limitations of past attempts. In particular, we successfully pre-train ByGPT5, a new token-free decoder-only language model, and fine-tune it on a large custom corpus of English and German quatrains annotated with our styles. We show that ByGPT5 outperforms other models such as mT5, ByT5, GPT-2 and ChatGPT, while also being more parameter efficient and performing favorably compared to humans. In addition, we analyze its runtime performance and demonstrate that it is not prone to memorization. We make our code, models, and datasets publicly available.
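To make the "token-free" idea concrete, here is a minimal Python sketch of byte-level input encoding. It is an illustration, not the authors' code: the helper names (`to_byte_ids`, `from_byte_ids`, `shared_suffix_len`) are hypothetical, and the offset of 3 reserved for special tokens is an assumption modeled on the convention commonly used by byte-level models. The point it shows is that rhyming line endings share a literal byte suffix that such a model can see directly, whereas a subword tokenizer may split the two endings differently.

```python
def to_byte_ids(text: str, offset: int = 3) -> list[int]:
    """Map text to one integer ID per UTF-8 byte.

    `offset` reserves the lowest IDs for special tokens
    (assumed value, not taken from the paper).
    """
    return [b + offset for b in text.encode("utf-8")]


def from_byte_ids(ids: list[int], offset: int = 3) -> str:
    """Invert to_byte_ids, assuming the IDs contain no special tokens."""
    return bytes(i - offset for i in ids).decode("utf-8")


def shared_suffix_len(a: list[int], b: list[int]) -> int:
    """Count how many trailing IDs two sequences have in common."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n


# Two made-up line endings that rhyme ("light" / "night").
ids_a = to_byte_ids("the silver light")
ids_b = to_byte_ids("the silent night")
print(shared_suffix_len(ids_a, ids_b))  # 4, the bytes of "ight"

# Non-ASCII characters take more than one byte: "ü" is two bytes in UTF-8.
print(len(to_byte_ids("Müde")))  # 5
```

The trade-off is sequence length: every character costs at least one position, so byte-level inputs are longer than subword inputs, which is one reason the abstract also reports on runtime performance.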

Paper Structure (26 sections, 1 equation, 9 figures, 7 tables)

Figures (9)

  • Figure 1: Generated quatrain with ABBA rhyme scheme, a high amount of alliteration (green), and iambic meter, i.e., each unstressed syllable (*u) is followed by a stressed syllable (*_).
  • Figure 2: Perplexity on the training data when pre-training for English and German.
  • Figure 3: Automatic evaluation results for all models on English and German.
  • Figure 4: Distributions of BWS scores for rhyme, human likeness, and meter annotations through kernel density estimation. Scores range from -1 (very bad) to 1 (very good). The "•" markers denote expected values.
  • Figure 5: Automatic evaluation of low-resource models.
  • ...and 4 more figures
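
The BWS (best-worst scaling) scores in Figure 4 range from -1 to 1. A common way to obtain scores in that range is the counting formulation: for each item, the number of times it was picked as best minus the number of times it was picked as worst, divided by the number of times it was shown. The sketch below assumes that formulation; the paper's exact procedure is not given in this section, and the function name `bws_scores`, the poem IDs, and the annotations are made up for illustration.

```python
from collections import Counter


def bws_scores(annotations):
    """Compute best-worst scaling scores by counting.

    `annotations` is an iterable of (items, best, worst) tuples, where
    `items` is the tuple shown to an annotator and `best` / `worst` are
    the annotator's picks from it.
    Score per item = (#best - #worst) / #appearances, which lies in [-1, 1].
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        seen.update(items)
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Hypothetical example: three annotators judge the same four poems.
poems = ("p1", "p2", "p3", "p4")
annotations = [
    (poems, "p1", "p4"),
    (poems, "p1", "p3"),
    (poems, "p2", "p4"),
]
scores = bws_scores(annotations)
for poem in poems:
    print(poem, round(scores[poem], 3))
```

An item chosen as best every time it appears scores 1, one chosen as worst every time scores -1, and an item never singled out scores 0.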