Prior Lessons of Incremental Dialogue and Robot Action Management for the Age of Language Models

Casey Kennington, Pierre Lison, David Schlangen

TL;DR

We find that there is very little research on incremental dialogue management, offer some requirements for practical incremental dialogue management, and discuss the implications of incremental dialogue for embodied, robotic platforms in the age of large language models.

Abstract

Efforts towards endowing robots with the ability to speak have benefited from recent advancements in natural language processing, in particular large language models. However, current language models are not fully incremental: their processing is inherently monotonic, so they lack the ability to revise their interpretations or output in light of newer observations. This monotonicity has important implications for the development of dialogue systems for human-robot interaction. In this paper, we review the literature on interactive systems that operate incrementally (i.e., at the word level or below it). We motivate the need for incremental systems and survey incremental modeling of important aspects of dialogue such as speech recognition and language generation. Our primary focus is on the part of the system that makes decisions, known as the dialogue manager. We find that there is very little research on incremental dialogue management, offer some requirements for practical incremental dialogue management, and discuss the implications of incremental dialogue for embodied, robotic platforms in the age of large language models.

Paper Structure (25 sections, 4 figures)

Figures (4)

  • Figure 1: Traditional architecture for spoken dialogue systems, composed of Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), Dialogue Management (DM), Natural Language Generation (NLG), and Text-to-Speech Synthesis (TTS).
  • Figure 2: From Kennington2021-fm, an example of Pointer, Word, POS, and SEM incremental unit (IU) annotations for a sample from the Localized Narratives dataset. Solid lines denote same-level links (SLLs), dashed lines denote grounded-in links (GRINs), and dotted lines denote an alignment between two modalities. Image taken from https://google.github.io/localized-narratives.
  • Figure 3: From Pincus2017-oh, an example of human-human and game intelligent update dialogues with barge-in.
  • Figure 4: From Zilka2015-ul, a schematic of an LSTM-based dialogue state tracker.