Multi-Document Grounded Multi-Turn Synthetic Dialog Generation

Young-Suk Lee; Chulaka Gunasekara; Danish Contractor; Ramón Fernandez Astudillo; Radu Florian

TL;DR

Both human and automatic evaluations of answerable queries indicate that models fine-tuned on synthetic dialogs consistently outperform those fine-tuned on existing human-generated training data across four publicly available multi-turn document grounded benchmark test sets.

Abstract

We introduce a technique for multi-document grounded multi-turn synthetic dialog generation that incorporates three main ideas. First, we control the overall dialog flow using taxonomy-driven user queries that are generated with Chain-of-Thought (CoT) prompting. Second, we support the generation of multi-document grounded dialogs by mimicking real-world use of retrievers to update the grounding documents after every user turn in the dialog. Third, we apply LLM-as-a-Judge to filter out queries with incorrect answers. Human evaluation of the synthetic dialog data suggests that the data is diverse, coherent, and includes mostly correct answers. Both human and automatic evaluations of answerable queries indicate that models fine-tuned on synthetic dialogs consistently outperform those fine-tuned on existing human-generated training data across four publicly available multi-turn document grounded benchmark test sets.
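The three ideas in the abstract can be read as a simple generation loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `Turn`, `retrieve`, `generate_dialog`, and `filter_turns` are hypothetical names, the word-overlap retriever is a toy stand-in for a real retriever, and `generate_query`, `generate_answer`, and `judge` are caller-supplied callables standing in for the taxonomy-driven CoT prompts and the LLM-as-a-Judge. How the actual pipeline treats the turns that follow a rejected turn is not stated here, so the filter simply keeps the turns the judge accepts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Turn:
    query: str
    answer: str
    documents: List[str]  # grounding passages used for this turn


def retrieve(query: str, corpus: List[str], top_k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by word overlap with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def generate_dialog(
    seed_document: str,
    corpus: List[str],
    num_turns: int,
    generate_query: Callable[[List[str], List[Turn], int], str],
    generate_answer: Callable[[str, List[str]], str],
    top_k: int = 2,
) -> List[Turn]:
    """Generate a dialog, refreshing the grounding documents after every user turn."""
    dialog: List[Turn] = []
    documents = [seed_document]  # the first query is grounded on a single document
    for turn_index in range(num_turns):
        # Hypothetical hook: a taxonomy-driven CoT prompt would produce the query.
        query = generate_query(documents, dialog, turn_index)
        # Mimic a deployed retriever: re-retrieve passages for the new query.
        documents = retrieve(query, corpus, top_k)
        answer = generate_answer(query, documents)
        dialog.append(Turn(query, answer, documents))
    return dialog


def filter_turns(dialog: List[Turn], judge: Callable[[Turn], bool]) -> List[Turn]:
    """Keep only the turns whose answer the judge accepts (assumed filtering policy)."""
    return [turn for turn in dialog if judge(turn)]
```

In a real setting, the three callables would wrap calls to a language model, and `retrieve` would be replaced by an actual retriever over the document collection.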

Paper Structure (18 sections, 11 figures, 10 tables, 2 algorithms)

Introduction
Multi-turn Dialog Generation
Question Taxonomy
Query generation with CoT Prompting
Response generation with CoT Prompting
Dialog Generation
Single-document grounded dialog
Multi-document grounded dialog
Retriever
LLM-as-a-Judge for Correctness
Data Quality Evaluation
Experimental Results
Automatic Evaluations
Human Evaluation
Multi-Document Grounded Dialogs
...and 3 more sections

Figures (11)

Figure 1: Overview of the document grounded multi-turn synthetic dialog generation pipeline. We distinguish two dialog styles: single-document grounded (light green boxes) and multi-document grounded (retrieval-augmented generation, pink boxes). Both styles share the same starting-turn query taxonomy (ST-QT), the CoT prompt for initial query (Query$_{1}$) generation, the multi-turn query taxonomy (MT-QT), and the CoT prompt for second-turn query (Query$_{2}$) generation. User queries and agent answers are generated by an LLM. Given the initial query generated from the same single document, single-document grounded dialog generation proceeds according to Algorithm \ref{alg:mrc-dialog-algorithm}, and multi-document grounded dialog generation according to Algorithm \ref{alg:rag-dialog-algorithm}, both in §\ref{sec:dialog-flow}. After generating multi-turn dialogs, we apply LLM-as-a-Judge turn by turn to filter out queries with incorrect answers. We use Mixtral-8x7b-instruct as our language model for both data generation and LLM-as-a-Judge.
Figure 2: Winrate of two sets of fine-tuned models on the test sets judged by human annotators.
Figure 3: Chain-of-Thought prompt used for generating direct questions
Figure 5: Chain-of-Thought prompt used for generating aggregate questions
Figure 7: Chain-of-Thought prompt used for generating conversations with follow-up questions
...and 6 more figures
