Encoder-Decoder Framework for Interactive Free Verses with Generation with Controllable High-Quality Rhyming

Tommaso Pasini, Alejo López-Ávila, Husam Quteineh, Gerasimos Lampouras, Jinhua Du, Yubing Wang, Ze Li, Yusen Sun

TL;DR

This work tackles the challenge of generating lyrics that adhere to user-defined rhyming constraints while leveraging pretrained encoder-decoder models. It introduces a last-word-first fine-tuning strategy that prepends the rhyming word to each lyric segment, enabling rhyme decisions before content generation yet preserving left-to-right generation compatible with standard PLMs. The authors present Plain and Multitask variants, accompany them with datasets in English and 12 additional languages, and provide extensive experiments showing improved rhyming precision and readability over baselines. Multilingual evaluation reveals language-specific challenges, with English performing best, and human evaluation indicates the approach yields lyrics that are increasingly meaningful and grammatically correct, approaching human quality in some settings. The work advances controllable, high-quality rhyming lyric generation and supplies data, metrics, and insights for further cross-language research and bias-aware generation.

Abstract

Composing poetry or lyrics involves several creative factors, but a particularly challenging aspect of generation is adherence to a more or less strict metric and rhyming pattern. To address this challenge, previous work has mainly focused on reverse language modeling, which brings the critical selection of each rhyming word to the forefront of each verse. However, reversing the word order requires that models be trained from scratch with this task-specific goal, so they cannot take advantage of transfer learning from a Pretrained Language Model (PLM). We propose a novel fine-tuning approach that prepends the rhyming word to the start of each lyric. This allows the critical rhyming decision to be made before the model commits to the content of the lyric (as in reverse language modeling) while maintaining compatibility with the word order of regular PLMs, since the lyric itself is still generated left to right. We conducted extensive experiments comparing this fine-tuning approach against current state-of-the-art strategies for rhyming, finding that it generates more readable text and rhymes better. Furthermore, we furnish a high-quality dataset in English and 12 other languages, analyse the approach's feasibility in a multilingual context, provide extensive experimental results shedding light on good and bad practices for lyrics generation, and propose metrics for comparing methods in the future.
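
The last-word-first idea in the abstract can be illustrated with a minimal sketch of how a fine-tuning target might be formatted. This is not the authors' code: the separator token `<sep>`, the function names, and the punctuation and lowercasing choices are all assumptions made for illustration, and the paper's actual markup may differ.

```python
import string

SEP = "<sep>"  # hypothetical separator token, assumed for this sketch


def last_word_first(line: str, sep: str = SEP) -> str:
    """Prepend the line's final word (the rhyming word) to the line itself."""
    words = line.split()
    if not words:
        return line
    # Strip trailing punctuation so the prefix holds only the rhyming word.
    rhyme_word = words[-1].strip(string.punctuation).lower()
    return f"{rhyme_word} {sep} {line.strip()}"


def strip_rhyme_prefix(generated: str, sep: str = SEP) -> str:
    """Recover the plain lyric from a 'word <sep> lyric' string."""
    _, found, rest = generated.partition(sep)
    return rest.strip() if found else generated.strip()


target = last_word_first("I walk alone tonight,")
print(target)
print(strip_rhyme_prefix(target))
```

Under this formatting, the decoder emits the rhyming word first and then writes the lyric in ordinary left-to-right order, so the text after the separator keeps the word order a pretrained model already expects.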

Paper Structure

This paper contains 20 sections, 3 equations, 2 figures, and 6 tables.

Figures (2)

  • Figure 1: Drawing of the training procedure adopted to inject rhyming knowledge within an encoder-decoder model.
  • Figure 2: Drawing of the multitask training procedure.