ART: The Alternating Reading Task Corpus for Speech Entrainment and Imitation
Zheng Yuan, Dorina de Jong, Štefan Beňuš, Noël Nguyen, Ruitao Feng, Róbert Sabo, Luciano Fadiga, Alessandro D'Ausilio
TL;DR
The ART Corpus examines how acoustic-prosodic entrainment and imitation manifest in L2-L2 speech interactions under controlled reading conditions. It is a multi-condition dataset (solo, alternating, imitation) organised into sub-corpora of French-, Italian-, and Slovak-accented English, with time-aligned transcripts, English proficiency scores, demographics, and post-experiment questionnaires, so that the effects of proficiency and social factors on entrainment can be analysed. Initial analyses quantify global proximity via inner-speaker and inner-dyad distances across eight prosodic features, revealing robust entrainment trends from solo to imitation and subcorpus-dependent variability, with Max Pitch and Shimmer emerging as consistent indicators. The resource supports replicable investigation of entrainment mechanisms and has potential applications in language education, speech technology, and human–machine interaction, while leaving room for future expansion and richer perceptual ratings.
Abstract
We introduce the Alternating Reading Task (ART) Corpus, a collection of dyadic sentence readings for studying entrainment and imitation behaviour in speech communication. The ART Corpus features three experimental conditions (solo reading, alternating reading, and deliberate imitation) and three sub-corpora covering French-, Italian-, and Slovak-accented English. This design allows systematic investigation of speech entrainment in a controlled, less spontaneous setting. Alongside detailed transcriptions, the corpus includes English proficiency scores, demographics, and in-experiment questionnaires for probing linguistic, personal, and interpersonal influences on entrainment. We describe its design, collection, and annotation processes, present an initial analysis, and outline future research prospects.
