A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere In Between?

Kahyun Choi; Minje Kim

TL;DR

This work addresses the gap in understanding the acoustic characteristics of poetry reading by placing it on a spectrum between narrative speech and singing. It proposes a scalable signal-processing pipeline that analyzes silence patterns, local pitch variability, and beat stability across three large corpora, using WhisperX preprocessing, pYIN pitch estimation, and a dynamic-programming beat tracker with the objective $C(\{t_i\})=\sum_{i=1}^N O(t_i) + \alpha \sum_{i=2}^N F(t_i - t_{i-1}, \tau_p)$ under regularization settings $\alpha \in \{1, 1000\}$. Analysis of the Poetry Foundation poetry readings, LibriSpeech narration, and Intonation singing shows that poetry reading exhibits intermediate characteristics, sharing musical traits with singing while retaining narrative-like pitch variation; beat patterns are present but not as rigid as in singing. The findings provide a quantitative bridge between speech and music domains and underscore the value of open-source tools for reproducible, large-scale poetry-audio research.
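To make the beat-tracking objective quoted above concrete, the following minimal sketch evaluates $C(\{t_i\})$ for a given candidate beat sequence. It is an illustration under stated assumptions, not the authors' implementation: $O(t)$ is read as an onset-strength function, $\tau_p$ as the target beat period, and $F$ is assumed to take the log-squared form $F(\Delta t, \tau) = -(\log(\Delta t/\tau))^2$ commonly used in dynamic-programming beat trackers. The names `transition_score` and `beat_objective` are hypothetical.

```python
import math


def transition_score(delta, tau):
    """Assumed tempo-consistency term F: zero when the inter-beat
    interval equals tau, increasingly negative as it deviates."""
    return -(math.log(delta / tau) ** 2)


def beat_objective(beat_times, onset_strength, tau, alpha):
    """Evaluate C({t_i}) = sum_i O(t_i) + alpha * sum_i F(t_i - t_{i-1}, tau).

    beat_times:     increasing list of candidate beat times in seconds
    onset_strength: callable O(t), an assumed onset-strength lookup
    tau:            target beat period in seconds
    alpha:          weight on tempo regularity (the TL;DR cites 1 and 1000)
    """
    onset_term = sum(onset_strength(t) for t in beat_times)
    tempo_term = sum(
        transition_score(later - earlier, tau)
        for earlier, later in zip(beat_times, beat_times[1:])
    )
    return onset_term + alpha * tempo_term


# Toy onset strength: every time point is equally "onset-like".
flat_onsets = lambda t: 1.0

regular = [0.0, 0.5, 1.0, 1.5]     # perfectly periodic beats
irregular = [0.0, 0.5, 1.25, 1.5]  # same count, uneven spacing

print(beat_objective(regular, flat_onsets, tau=0.5, alpha=1000))
print(beat_objective(irregular, flat_onsets, tau=0.5, alpha=1000))
```

With a flat onset function, the regular sequence incurs no tempo penalty, so its score equals the number of beats, while the irregular sequence is penalized more heavily as $\alpha$ grows. Under this reading, a large $\alpha$ favors a strictly periodic beat grid and a small $\alpha$ lets the beats follow local onsets.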

Abstract

This paper provides a large-scale computational analysis of poetry reading audio signals to unveil the musicality within professionally read poems. Although the acoustic characteristics of other types of spoken language have been extensively studied, most of the literature is limited to narrative speech or singing voice and to how the two differ from each other. In this work, we develop signal processing methods tailored to capture the unique acoustic characteristics of poetry reading based on its silence patterns, temporal variations of local pitch, and beat stability. Our large-scale statistical analyses of three large corpora, covering narration (LibriSpeech), singing voice (Intonation), and poetry reading (The Poetry Foundation) respectively, show that poetry reading shares some musical characteristics with singing voice, although it also resembles narrative speech in other respects.
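As a rough illustration of what "temporal variations of local pitch" could mean in practice, the sketch below computes the standard deviation of a pitch contour inside short sliding windows, skipping unvoiced frames. It is a minimal sketch under assumptions: the window length, hop, and minimum number of voiced frames are hypothetical parameters, the function name `local_pitch_std` is invented for this example, and the input is assumed to be a per-frame pitch track (for example from a pitch estimator such as pYIN) with `None` marking unvoiced frames.

```python
import statistics


def local_pitch_std(f0, window=5, hop=1, min_voiced=2):
    """Standard deviation of pitch within each sliding window.

    f0:         per-frame pitch values (any consistent unit), None if unvoiced
    window:     window length in frames (hypothetical default)
    hop:        step between consecutive windows in frames
    min_voiced: windows with fewer voiced frames than this are skipped
    """
    stds = []
    for start in range(0, len(f0) - window + 1, hop):
        voiced = [v for v in f0[start:start + window] if v is not None]
        if len(voiced) >= min_voiced:
            # Population std: the window is treated as the full local sample.
            stds.append(statistics.pstdev(voiced))
    return stds


# A sustained note (flat pitch) versus a steady glide.
sustained = [220.0] * 10
glide = [200.0 + 10.0 * i for i in range(10)]

print(local_pitch_std(sustained))
print(local_pitch_std(glide))
```

A sustained pitch yields local standard deviations of zero, while a glide yields consistently larger values. A histogram of such values per corpus is one plausible way to compare how steadily pitch is held in singing, narration, and poetry reading.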

Paper Structure (11 sections, 2 equations, 4 figures, 1 table)

Introduction
Methodology
Preprocessing via Transcription
Local Pitch Variability
Beat Stability
Datasets
Experiments
Silence Patterns
Local Pitch Variability
Beat Stability
Conclusion

Figures (4)

Figure 1: Histograms of the (a) short (b) medium (c) long silent segments.
Figure 2: Pitch contours (top) and local standard deviation (bottom).
Figure 3: Histograms of the std values of the local pitch contours.
Figure 4: Histograms of the beat tracking scores.
