Automated Evaluation of Meter and Rhyme in Russian Generative and Human-Authored Poetry

Ilya Koziev

TL;DR

This work presents RPST, an open-source tool for automatic stress placement, meter detection, and rhyme analysis in Russian syllabo-tonic poetry, along with RIFMA, a large annotated dataset of Russian poetry fragments for benchmarking LLMs and data engineering. Through human evaluations and side-by-side comparisons, the authors demonstrate that RPST-derived metrics correlate strongly with human judgments ($r = 0.79$, $p = 3.2 \times 10^{-181}$) and can improve data quality and model ranking in poetry generation. The study also reports large-scale analyses of amateur poetry to illustrate the prevalence of meter patterns and the challenges of achieving high technicality across lines. Limitations include meter coverage and POS-analysis capabilities; future work is aimed at multilingual extension and broader dataset coverage. The resources are released under the MIT license to promote open research.

Abstract

Generative poetry systems require effective tools for data engineering and automatic evaluation, particularly to assess how well a poem adheres to versification rules, such as the correct alternation of stressed and unstressed syllables and the presence of rhymes. In this work, we introduce the Russian Poetry Scansion Tool (RPST), a library designed for stress mark placement in Russian-language syllabo-tonic poetry, rhyme detection, and identification of defects of poeticness. Additionally, we release RIFMA -- a dataset of poem fragments spanning various genres and forms, annotated with stress marks. This dataset can be used to evaluate how accurately modern large language models place stress marks in poetic texts. The published resources provide valuable tools for researchers and practitioners in the field of creative generative AI, facilitating advancements in the development and evaluation of generative poetry systems.

Paper Structure

This paper contains 11 sections, 1 figure, 7 tables.

Figures (1)

  • Figure 1: Distribution of technicality scores per line for scraped poems.