Let the Poem Hit the Rhythm: Using a Byte-Based Transformer for Beat-Aligned Poetry Generation
Mohamad Elzohbi, Richard Zhao
TL;DR
This work investigates beat-aligned poetry generation by training a byte-based transformer, ByT5, to insert or replace words so that poetry verses $S$ conform to a target beat pattern $B(W)$ without losing semantic coherence. It derives beat patterns from grapheme-to-phoneme mappings into consonant-vowel structures, encoding rhythm as patterns of vowel onsets, and trains the model with a span-masking objective that conditions generation on these rhythms. The ByT5 variants achieve high alignment accuracy (≈$98.3$–$98.9\%$ exact; ≈$99.6$–$99.8\%$ Levenshtein) while maintaining coherence comparable to baselines, and they outperform a GPT-4 baseline on the same task. These results suggest practical potential for co-creative rhythmic poetry and beat-aware lyric generation. Future work will extend the approach to full verse generation and add human evaluations to validate rhythmic fidelity.
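To make the beat-pattern idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the phonemes are already available (the grapheme-to-phoneme step is skipped), uses a hypothetical ARPAbet-style vowel inventory, and encodes a beat pattern as a binary string in which `1` marks a vowel onset. The function names and the length-normalised Levenshtein similarity are likewise illustrative assumptions; the paper's exact encoding and scoring may differ.

```python
# Hypothetical ARPAbet-style vowel inventory (an assumption for illustration).
VOWELS = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


def cv_structure(phonemes):
    """Map a phoneme sequence to a consonant-vowel string such as 'CVCVC'."""
    return "".join("V" if p in VOWELS else "C" for p in phonemes)


def beat_pattern(cv):
    """Assumed encoding: '1' at each vowel onset (a V not preceded by a V), else '0'."""
    bits = []
    prev = "C"
    for ch in cv:
        bits.append("1" if ch == "V" and prev != "V" else "0")
        prev = ch
    return "".join(bits)


def levenshtein(a, b):
    """Classic edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def levenshtein_similarity(a, b):
    """Assumed normalisation: 1 - distance / length of the longer string."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def exact_match(target, generated):
    """Exact alignment: the generated pattern equals the target pattern."""
    return target == generated


if __name__ == "__main__":
    target = beat_pattern(cv_structure(["R", "IH", "DH", "AH", "M"]))
    candidate = beat_pattern(cv_structure(["P", "OW", "AH", "M"]))
    print(target, candidate, exact_match(target, candidate),
          levenshtein_similarity(target, candidate))
```

Under these assumptions, the exact score rewards only a perfect pattern match, while the Levenshtein-based score gives partial credit when the generated word's pattern is off by a few positions.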
Abstract
The intersection between poetry and music provides an interesting case for computational creativity, yet remains relatively unexplored. This paper explores the integration of poetry and music through the lens of beat patterns, investigating whether a byte-based language model can generate words that fit specific beat patterns within the context of poetry. Drawing on earlier studies, we developed a method to train a byte-based transformer model, ByT5, to align poems with beat patterns. The results demonstrate a high level of beat alignment while maintaining semantic coherence. Future work will aim to improve the model's ability to create complete beat-aligned poems.
