Learning from Implicit User Feedback, Emotions and Demographic Information in Task-Oriented and Document-Grounded Dialogues

Dominic Petrak, Thy Thy Tran, Iryna Gurevych

TL;DR

This paper introduces FEDI, the first English task-oriented and document-grounded dialogue dataset annotated with implicit user feedback, user emotions and demographic information. Experiments show that this information has a particularly positive impact on task completion and factual consistency.

Abstract

Implicit user feedback, user emotions and demographic information have been shown to be promising sources for improving the accuracy and user engagement of responses generated by dialogue systems. However, the influence of such information on task completion and factual consistency, which are important criteria for task-oriented and document-grounded dialogues, is not yet known. To address this, we introduce FEDI, the first English task-oriented and document-grounded dialogue dataset annotated with this information. Our experiments with Flan-T5, GPT-2 and Llama 2 show a particularly positive impact on task completion and factual consistency. Participants in our human evaluation reported that the responses generated by the feedback-trained models were more informative (Flan-T5 and GPT-2), and more relevant and factually consistent (Llama 2).

Paper Structure (69 sections, 1 equation, 34 figures, 17 tables)

Figures (34)

  • Figure 1: A feedback dialogue from FEDI, annotated with user emotions and implicit user feedback (generation error and user feedback types).
  • Figure 2: Overview of our framework for generating and annotating dialogues. *B (the green arrow) symbolizes GPT-3.5. The generation of feedback dialogues requires feedback scenarios as an additional source. For question answering dialogues, we include the respective documents in the task description.
  • Figure 3: Feedback dialogue generation. Each version solves one of the feedback scenarios from Version 1. See the prompts appendix for an example dialogue.
  • Figure 4: Ratio of the most commonly observed user emotions in FEDI (excluding the Neutral emotion).
  • Figure 5: Distribution of user feedback types in relation to generation error types in feedback scenarios.
  • ...and 29 more figures