Evaluation of a semi-autonomous attentive listening system with takeover prompting

Haruki Kawai, Divesh Lala, Koji Inoue, Keiko Ochi, Tatsuya Kawahara

TL;DR

The paper addresses engagement decline in attentive listening dialogue systems by introducing a semi-autonomous setup in which a remote operator can take over in real time. It implements explicit takeover prompts based on engagement cues and compares semi-autonomous, fully autonomous, and fully tele-operated conditions in Japanese using MMDAgent. Results indicate that the semi-autonomous system improves empathy and user interest over the autonomous baseline, with takeovers generally perceived positively and that perception not strongly tied to how often takeovers occur. The findings suggest that identifying and leveraging optimal takeover points can inform improvements to autonomous dialogue policies and scalability to multi-conversation settings.

Abstract

The handling of communication breakdowns and loss of engagement is an important aspect of spoken dialogue systems, particularly for chatting systems such as attentive listening, where the user is mostly speaking. We presume that a human is best equipped to handle this task and rescue the flow of conversation. To this end, we propose a semi-autonomous system, where a remote operator can take control of an autonomous attentive listening system in real time. In order to make human intervention easy and consistent, we introduce automatic detection of low interest and engagement to provide explicit takeover prompts to the remote operator. We implement this semi-autonomous system, which detects takeover points for the operator, and compare it to fully tele-operated and fully autonomous attentive listening systems. We find that the semi-autonomous system is generally perceived more positively than the autonomous system. The results suggest that identifying points in the conversation where the user starts to lose interest may help us improve a fully autonomous dialogue system.

Paper Structure (10 sections, 5 figures, 2 tables)

Figures (5)

  • Figure 1: General concept of the semi-autonomous system. The top half shows the dialogue system in agent control, where it is autonomous and monitored by a remote operator on the right-hand side. If the system detects a problem with the conversation, it informs the operator through the GUI, and the operator switches to operator control. The operator can then take over and speak through the agent. Switching between agent and operator control is always done by the operator.
  • Figure 2: The virtual agent used in the work, executing a "happy" expression.
  • Figure 3: The interface seen by the remote operator before takeover (left) and after takeover (right). Note that a message appears when the system detects that a takeover is necessary. Once the operator takes over, the microphone button is highlighted to show that they can directly speak to the user.
  • Figure 4: Average scores for each measure across the three conditions. Sample points are given for each of the 20 subjects per condition.
  • Figure 5: Average scores for each measure across the three conditions. Sample points are given for each of the 20 subjects per condition.
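
The control flow described in the Figure 1 and Figure 3 captions can be sketched as a small state machine. This is only an illustrative sketch, not the authors' implementation: the class and method names, the numeric engagement score, and the 0.5 threshold are all assumptions introduced here, since the source does not specify how engagement is scored or thresholded. What it does take from the captions is that the system only raises a prompt, and that the operator alone switches control.

```python
from dataclasses import dataclass, field
from enum import Enum


class Control(Enum):
    AGENT = "agent"        # autonomous system is speaking
    OPERATOR = "operator"  # remote operator speaks through the agent


@dataclass
class SemiAutonomousSession:
    # Hypothetical threshold; the source does not state one.
    engagement_threshold: float = 0.5
    control: Control = Control.AGENT
    prompt_shown: bool = False
    log: list = field(default_factory=list)

    def observe(self, engagement: float) -> None:
        """Update the takeover prompt from an assumed engagement score in [0, 1]."""
        if self.control is Control.AGENT:
            # The prompt is advisory: it never changes control by itself.
            self.prompt_shown = engagement < self.engagement_threshold
        self.log.append((engagement, self.control, self.prompt_shown))

    def operator_switch(self) -> None:
        """Toggle control; per the captions, only the operator does this."""
        if self.control is Control.AGENT:
            self.control = Control.OPERATOR
        else:
            self.control = Control.AGENT
        self.prompt_shown = False


session = SemiAutonomousSession()
session.observe(0.8)       # user engaged: no prompt
session.observe(0.3)       # low engagement: prompt appears, agent still in control
session.operator_switch()  # operator decides to take over
```

Keeping the prompt separate from the control state mirrors the design in which detection assists the operator rather than replacing their judgment.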