Analysing Cross-Speaker Convergence in Face-to-Face Dialogue through the Lens of Automatically Detected Shared Linguistic Constructions

Esam Ghaleb, Marlou Rasenberg, Wim Pouw, Ivan Toni, Judith Holler, Aslı Özyürek, Raquel Fernández

TL;DR

The paper examines how cross-speaker alignment during referential communication relates to convergence on object labels. It introduces an automated method for detecting shared lemmatised constructions that refer to the same referent, and applies it to 66 Dutch-speaking dyads performing a six-round director-matcher task with novel objects (fribbles), with object naming elicited before and after the task. The analyses reveal pervasive alignment. They also show that using many construction types for an object is associated with lower convergence, whereas a dominant, frequent construction used toward the end of the interaction is associated with greater similarity between the two speakers' post-interaction names. The approach provides a scalable lens on referential negotiation and language conventionalisation, with implications for understanding how common ground emerges in dialogue and for building more grounded conversational systems.

Abstract

Conversation requires a substantial amount of coordination between dialogue participants, from managing turn taking to negotiating mutual understanding. Part of this coordination effort surfaces as the reuse of linguistic behaviour across speakers, a process often referred to as alignment. While the presence of linguistic alignment is well documented in the literature, several questions remain open, including the extent to which patterns of reuse across speakers have an impact on the emergence of labelling conventions for novel referents. In this study, we put forward a methodology for automatically detecting shared lemmatised constructions -- expressions with a common lexical core used by both speakers within a dialogue -- and apply it to a referential communication corpus where participants aim to identify novel objects for which no established labels exist. Our analyses uncover the usage patterns of shared constructions in interaction and reveal that features such as their frequency and the number of different constructions used for a referent are associated with the degree of object labelling convergence the participants exhibit after social interaction. More generally, the present study shows that automatically detected shared constructions offer a useful level of analysis to investigate the dynamics of reference negotiation in dialogue.
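
To make the notion of a shared lemmatised construction concrete, the following is a minimal sketch and not the authors' detection pipeline, whose details are not given in this summary. It assumes utterances arrive already lemmatised, treats a construction as a contiguous lemma n-gram, and counts it as shared when both speakers have produced it. The function name `shared_constructions` and the `max_n` parameter are hypothetical.

```python
from collections import defaultdict

def shared_constructions(utterances, max_n=3):
    """Return lemma n-grams produced by at least two speakers.

    utterances: list of (speaker, [lemma, ...]) pairs in dialogue order.
    Assumption: lemmatisation happened upstream; no function-word filter.
    """
    speakers_by_ngram = defaultdict(set)
    for speaker, lemmas in utterances:
        for n in range(1, max_n + 1):
            for i in range(len(lemmas) - n + 1):
                speakers_by_ngram[tuple(lemmas[i:i + n])].add(speaker)
    # Keep only the n-grams that more than one speaker has used.
    return {ng for ng, spk in speakers_by_ngram.items() if len(spk) >= 2}

# Toy dialogue (invented for illustration, not taken from the corpus).
dialogue = [
    ("A", ["the", "boiler", "with", "a", "nose"]),
    ("B", ["boiler", "with", "a", "hat"]),
    ("A", ["pinocchio"]),
]
shared = shared_constructions(dialogue)
```

In this toy dialogue, `("boiler",)` and `("boiler", "with", "a")` come out as shared, while `("pinocchio",)` does not because only speaker A uses it. A realistic pipeline would also need to discard n-grams made up only of function words such as `("with", "a")`, and to link each construction to the referent under discussion.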

Paper Structure (15 sections, 5 figures)

Figures (5)

  • Figure 1: Four example fribbles used as stimuli in the task.
  • Figure 2: Pre- and post-interaction names and shared construction types for one fribble, as produced by one dyad. Before the interaction, speakers A and B refer to the fribble as "pinocchio science art" and "diamond bar on top," respectively. The figure shows the shared constructions that emerge, with arrows indicating the order in which the speakers repeat them. For instance, in the first round, speaker A, acting as the director, refers to the fribble as "pinocchio" twice, and speaker B repeats this construction in the second round. This dyad uses three shared construction types for this fribble (indicated by the colours purple, orange, and green). The types "book" and "pinocchio" are dropped after Round 3, while the type "boiler" is used in all rounds (11 times in total) and also appears in both speakers' post-interaction names.
  • Figure 3: Percentage of utterances containing shared constructions over the rounds of interaction.
  • Figure 4: Lexical cosine similarity between a speaker's pre- and post-interaction names and the shared constructions used for a fribble, over the six dialogue rounds.
  • Figure 5: Correlation between the number of shared construction types per object and the cosine similarity between the post-interaction names of the two participants in each dyad.
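
The lexical cosine similarity mentioned in Figures 4 and 5 can be illustrated with a small bag-of-words sketch. This is an assumption-laden example, not the paper's implementation: it tokenises on whitespace and uses raw word counts, whereas the paper may lemmatise or filter words differently. The function name `lexical_cosine` is hypothetical.

```python
from collections import Counter
from math import sqrt

def lexical_cosine(name_a: str, name_b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two names.

    Assumption: lowercase whitespace tokens stand in for the paper's
    (unspecified here) lexical representation.
    """
    a = Counter(name_a.lower().split())
    b = Counter(name_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty name shares nothing with anything
    return dot / (norm_a * norm_b)

# Pre-interaction names from the Figure 2 caption share no words.
pre = lexical_cosine("pinocchio science art", "diamond bar on top")  # 0.0
# Identical one-word names are maximally similar.
post = lexical_cosine("boiler", "boiler")  # 1.0
```

Under this sketch, a dyad whose post-interaction names both reduce to the same word scores 1.0, and a dyad with no words in common scores 0.0, which is the scale on which convergence is compared against the number of shared construction types.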