From "um" to "yeah": Producing, predicting, and regulating information flow in human conversation
Claire Augusta Bergey, Simon DeDeo
TL;DR
This work investigates how information flows in natural conversation by estimating surprisal-based information density from a large-scale naturalistic corpus (CANDOR) using long-context next-word predictions, yielding average rates of $13.21$ bits/second and $4.04$ bits/word. It demonstrates that high-surprisal words are produced more slowly, through both lexical-length and context-driven mechanisms, and that listeners’ backchannels regulate information flow by signaling upcoming shifts to novel material. The study also shows that retrieval and production impose distinct cognitive costs, with disfluencies acting as time-buying devices that predict increased upcoming surprisal. Overall, the findings support resource-limited models of conversational dynamics while highlighting the need for multiscale, non-Shannon-based frameworks to fully capture the variability and coordination in human dialogue.
Abstract
Conversation demands attention. Speakers must call words to mind, listeners must make sense of them, and both together must negotiate this flow of information, all in fractions of a second. We used large language models to study how this works in a large-scale dataset of English-language conversation, the CANDOR corpus. We provide a new estimate of the information density of unstructured conversation, approximately 13 bits/second, and find significant effects associated with the cognitive load of both retrieving and presenting that information. We also reveal a role for backchannels -- the brief yeahs, uh-huhs, and mhmms that listeners provide -- in regulating the production of novelty: the lead-up to a backchannel is associated with declining information rate, while speech downstream rebounds to previous rates. Our results provide new insights into long-standing theories of how we respond to fluctuating demands on cognitive resources, and how we negotiate those demands in partnership with others.
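
As a rough illustration of what an information-density estimate in bits/word and bits/second involves, the sketch below computes word-level surprisal, $-\log_2 p(\text{word} \mid \text{context})$, and divides the total by word count and by elapsed time. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the tuple layout, and the toy probabilities and timestamps are all hypothetical, and in practice the probabilities would come from a language model's next-word predictions.

```python
import math


def surprisal_bits(prob):
    """Surprisal of one word in bits, given its predicted probability."""
    return -math.log2(prob)


def information_density(words):
    """Summarize one stretch of speech.

    `words` is a list of (probability, start_sec, end_sec) tuples; this
    layout is an assumption made for illustration only.
    Returns (bits_per_word, bits_per_second).
    """
    total_bits = sum(surprisal_bits(p) for p, _, _ in words)
    # Elapsed time from the start of the first word to the end of the last.
    duration = words[-1][2] - words[0][1]
    return total_bits / len(words), total_bits / duration


# Hypothetical toy data: three words with made-up probabilities and timings.
toy_words = [(0.5, 0.0, 0.2), (0.25, 0.2, 0.5), (0.125, 0.5, 1.0)]
bits_per_word, bits_per_second = information_density(toy_words)
print(bits_per_word, bits_per_second)
```

If the two reported averages are taken over the same speech, their ratio ($13.21 / 4.04$) suggests a speaking rate of roughly 3.3 words per second, though a ratio of averages is only an approximation of the average rate.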
