MonoTODia: Translating Monologue Requests to Task-Oriented Dialogues

Sebastian Steindl, Ulrich Schäfer, Bernd Ludwig

TL;DR

MonoTODia tackles TOD data scarcity by translating real-world monologue e-mails into annotated, multi-turn dialogues with a two-phase pipeline of instruction-tuned LLMs. The pipeline fine-tunes an open-source 8B LLaMA-3.1 model with LoRA, separately for dialogue generation and for annotation; crowd workers validate the outputs, which are further tested as training data for downstream TOD models. A real-world travel-booking corpus from a German SME is translated to English, cleaned, clustered, and split into train/validation/test, with gold-standard annotations for the test set; a small gold set refines the annotator before it is applied more broadly. Experiments on dialogue state tracking and response generation indicate that the synthesized dialogues provide usable signal for TOD training, with larger models performing better, and the authors publicly release the dataset to spur future research in low-resource TOD scenarios.

Abstract

Data scarcity is one of the main problems in real-world applications of transformer-based models. This is especially evident for task-oriented dialogue (TOD) systems, which require specialized datasets that are usually not readily available. This can hinder companies from adding TOD systems to their services. This study therefore investigates a novel approach to sourcing annotated dialogues from existing German monologue material. Focusing on a real-world example, we investigate whether these monologues can be transformed into dialogue formats suitable for training TOD systems. We demonstrate the approach with the concrete example of a company specializing in travel bookings via e-mail. We fine-tune state-of-the-art Large Language Models for the task of rewriting e-mails as dialogues and annotating them. To ensure the quality and validity of the generated data, we employ crowd workers to evaluate the dialogues across multiple criteria and to provide gold-standard annotations for the test dataset. We further evaluate the usefulness of the dialogues for training TOD systems. Our evaluation shows that the dialogues and annotations are of high quality and can serve as a valuable starting point for training TOD systems. Finally, we make the annotated dataset publicly available to foster future research.

Paper Structure

This paper contains 23 sections, 3 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: The MonoTODia approach. Blue marks e-mails, green annotated dialogues, and red LLMs. Dashed arrows mark inference, dotted arrows training.
  • Figure 2: An example e-mail from the corpus after pre-processing on the left and the resulting annotated dialogue after applying the MonoTODia approach on the right.
  • Figure 3: The clustering of the e-mails used for the train split. We first convert the e-mails to TF-IDF vectors and then embed them with UMAP to build the clusters. Short e-mails clearly form the majority.
  • Figure 4: The full prompt used for dialogue generation. Omissions for the sake of brevity are marked in all-caps and bold.
  • Figure 5: The full prompt used for annotation. Omissions for the sake of brevity are marked in all-caps and bold.
  • ...and 2 more figures
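
The Figure 3 caption mentions converting e-mails to TF-IDF vectors before embedding them with UMAP. As a rough illustration of the first step only, the following is a minimal sketch of TF-IDF weighting in plain Python. The whitespace tokenizer, the smoothed IDF formula, and the example e-mails are assumptions made for illustration; they are not taken from the paper, and the UMAP projection and cluster assignment are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper's
    # actual pre-processing is not specified here.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / total
            # Smoothed IDF (an assumed variant): rare terms get more weight.
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical e-mails, invented for this example.
emails = [
    "please book a flight to berlin on monday",
    "book a flight to munich on friday",
    "need a hotel room in berlin for two nights",
]
vectors = tfidf_vectors(emails)
```

With these vectors, the two flight requests are more similar to each other than either is to the hotel request, which is the kind of structure a subsequent dimensionality reduction and clustering step could pick up.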