Turn-taking annotation for quantitative and qualitative analyses of conversation

Anneliese Kelterer, Barbara Schuppler

TL;DR

This paper presents a two-layer turn-taking annotation framework (IPU and PCOMP) for GRASS, enabling both qualitative and quantitative analyses of spontaneous conversations in Austrian German. The authors detail the criteria, label inventories, and time-aligned annotation workflow, supported by extensive inter-annotator reliability assessments that show high agreement, especially for IPU boundaries. They demonstrate the utility of the annotations through pilot analyses of conversational dynamics, illustrating how IPU and PCOMP capture distinct structural and prosodic patterns and can reveal cross-disciplinary research questions. The GRASS annotations and methodology are designed to be language-agnostic and broadly applicable to linguistic, psycholinguistic, and speech-technology investigations, thereby fostering cross-fertilization across domains and enabling replication and extension by other researchers.

Abstract

This paper has two goals. First, we present the turn-taking annotation layers created for 95 minutes of conversational speech of the Graz Corpus of Read and Spontaneous Speech (GRASS), which are available to the scientific community. Second, we describe the annotation system and the annotation process in more detail, so that other researchers may use it for their own conversational data. The annotation system was developed with an interdisciplinary application in mind. It should be based on sequential criteria according to Conversation Analysis; it should be suitable for subsequent phonetic analysis, so time-aligned annotations were made in Praat; and it should be suitable for automatic classification, which required the continuous annotation of speech and a label inventory that is not too large and results in high inter-rater agreement. Turn-taking was annotated on two layers, Inter-Pausal Units (IPU) and points of potential completion (PCOMP; similar to transition relevance places). We provide a detailed description of the annotation process and of the segmentation and labelling criteria. A detailed analysis of inter-rater agreement and common confusions shows that agreement for IPU annotation is near-perfect, that agreement for PCOMP annotation is substantial, and that disagreements are often either partial or can be explained by a different analysis of a sequence which also has merit. The annotation system can be applied to a variety of conversational data for linguistic studies and technological applications, and we hope that the annotations as well as the annotation system will contribute to a stronger cross-fertilization between these disciplines.
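
To make the agreement claims concrete, here is a minimal sketch of how chance-corrected agreement between two annotators could be computed. It assumes Cohen's kappa as the statistic, which the abstract itself does not name, and it assumes both annotators labelled the same, already aligned units. The label strings and the function name `cohen_kappa` are hypothetical placeholders, not the GRASS label inventory or the authors' evaluation code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same aligned units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of units that received identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions, summed.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six aligned units (not the paper's inventory).
annotator_1 = ["hold", "change", "hold", "question", "hold", "change"]
annotator_2 = ["hold", "change", "hold", "hold", "hold", "change"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

The sketch only covers label agreement. When the two annotators also segment the signal differently, as with diverging IPU boundaries, the units would first have to be aligned before any such comparison is possible.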

Paper Structure

This paper contains 22 sections, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Praat screenshot of orthographic, IPU and PCOMP annotations for two speakers (003M: first three rows; 023F: last three rows), illustrating the temporal organization of questions, turn-holds and turn-changes.
  • Figure 2: Decision tree for assigning single IPU labels. A decision tree for assigning single as well as combined IPU labels is presented in Appendix B.
  • Figure 3: Praat screenshot of IPU annotations by two annotators (annotator 1: lines 2 and 5; annotator 2: lines 3 and 6), illustrating divergent IPU segmentation and its consequences for labelling. We combined the two annotators' annotations into the same Textgrid only for illustrative purposes, but they did not see each other's annotations in the annotation process.
  • Figure 4: IPU annotations representing the dynamics of five minutes of conversation in 038F039F.
  • Figure 5: IPU annotations representing the dynamics of 100 seconds of conversation in 028F008M.
  • ...and 3 more figures