Creating, Using and Assessing a Generative-AI-Based Human-Chatbot-Dialogue Dataset with User-Interaction Learning Capabilities

Alfredo Cuzzocrea; Giovanni Pilato; Pablo Garcia Bringas

TL;DR

The paper addresses the need for emotion-aware dialogue datasets in customer-service contexts by generating synthetic conversations with ChatGPT 3.5 conditioned on target emotions and CEFR language levels (A2, B2, C2). It introduces a pipeline that yields per-turn emotional labels and CEFR annotations, including both Explicit and Implicit Emotion Dialogues, and applies quality control via a Quality of Interaction (QoI) metric alongside ARTE-based readability assessments. A set of experiments demonstrates emotionally coherent turn sequences across anger and surprise at multiple CEFR levels, with detailed readability analyses confirming alignment to the specified language complexity. The resulting dataset and analytics framework offer a scalable resource for training and evaluating emotion-aware, adaptive dialogue systems in customer support and related HCI domains.
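To make the conditioning concrete, the following is a minimal sketch of how the emotion and CEFR combinations named above could be enumerated and stored with per-turn labels. It is not the authors' pipeline: the class names, the field names, the prompt wording, and the reading of "explicit" as "the user names the emotion in words" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Optional, Tuple

# Values taken from the summary above; the tuple names are hypothetical.
EMOTIONS = ("anger", "surprise")
CEFR_LEVELS = ("A2", "B2", "C2")


@dataclass
class Turn:
    speaker: str                   # "user" or "agent"
    text: str
    emotion: Optional[str] = None  # per-turn emotional label, if annotated


@dataclass
class Dialogue:
    target_emotion: str
    cefr_level: str
    explicit: bool                 # assumed meaning: emotion is stated in words
    turns: List[Turn] = field(default_factory=list)


def build_prompt(emotion: str, cefr_level: str, explicit: bool) -> str:
    """Compose an illustrative generation prompt (wording is an assumption)."""
    style = (
        "The user states the emotion openly."
        if explicit
        else "The user conveys the emotion without naming it."
    )
    return (
        "Write a customer-service dialogue between a user and an AI agent. "
        f"The user writes at CEFR level {cefr_level} and expresses {emotion}. "
        f"{style} Label each turn with the emotion it expresses."
    )


def generation_grid() -> List[Tuple[str, str, bool]]:
    """Every (emotion, CEFR level, explicit flag) combination to generate."""
    return list(product(EMOTIONS, CEFR_LEVELS, (True, False)))


if __name__ == "__main__":
    for emotion, level, explicit in generation_grid():
        print(build_prompt(emotion, level, explicit))
```

Keeping the target emotion and CEFR level on the dialogue record, separate from the per-turn labels, lets a later check compare what was requested with what each turn actually expresses.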

Abstract

This study is a first step in ongoing work to build a dataset of dialogues for managing customer-service conversations between humans and AI chatbots. The approach uses ChatGPT 3.5 to generate the dialogues under two requirements: each dialogue must reflect a specific language proficiency level of the user, and the user must express a specific emotion during the interaction. The generated dialogues were then evaluated for overall quality, and the complexity of the language used by both humans and AI agents was evaluated using standard complexity measurements. Furthermore, the attitudes and interaction patterns exhibited by the chatbot at each turn were stored so that common conversation patterns in specific emotional contexts can be detected later. The methodology could improve human-AI dialogue effectiveness and serve as a basis for systems that learn from user interactions.
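As an illustration of one such complexity measurement, the sketch below computes the Automated Readability Index, assuming that this is the "ARI" named in the Figure 2 caption. The formula is the standard published one; the tokenization rules and the function name are simplifying assumptions, so scores may differ from those produced by the tool used in the paper.

```python
import re


def automated_readability_index(text: str) -> float:
    """Standard ARI: 4.71 * (chars / words) + 0.5 * (words / sentences) - 21.43.

    Sentences are split on runs of '.', '!' or '?', and words are runs of
    letters or digits with an optional apostrophe suffix. Both rules are
    simplifications chosen for this sketch.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    # Count only letters and digits, so apostrophes do not inflate the score.
    chars = sum(ch.isalnum() for word in words for ch in word)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


if __name__ == "__main__":
    simple = "The cat sat. The dog ran."
    harder = "Notwithstanding considerable inconvenience, the representative apologised."
    print(automated_readability_index(simple))
    print(automated_readability_index(harder))
```

Averaging this score separately over user turns and agent turns for each CEFR level would give the kind of per-level, per-speaker comparison that the figure captions describe.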

Paper Structure (11 sections, 17 figures)

Introduction
The Proposed Approach
Experimental Results
Anger and A2 CEFR Language Level
Anger and B2 CEFR Language Level
Anger and C2 CEFR Language Level
Surprise and A2 CEFR Language Level
Surprise and B2 CEFR Language Level
Surprise and C2 CEFR Language Level
Readability Results
Conclusions and Future work

Figures (17)

Figure 1: The overall schema of the proposed approach
Figure 2: The ARI average readability results for the A2, B2, and C2 CEFR levels both for the User and the Agent
Figure 3: The CAREC average readability results for the A2, B2, and C2 CEFR levels both for the User and the Agent
Figure 4: The CARECM average readability results for the A2, B2, and C2 CEFR levels both for the User and the Agent
Figure 5: The CML2 average readability results for the A2, B2, and C2 CEFR levels both for the User and the Agent
...and 12 more figures
