
Book2Dial: Generating Teacher-Student Interactions from Textbooks for Cost-Effective Development of Educational Chatbots

Junling Wang, Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Mrinmaya Sachan

TL;DR

This paper tackles the scarcity of high-quality data for educational chatbots by proposing Book2Dial, a framework that generates synthetic teacher-student dialogues grounded in open textbooks. It formalizes a versatile pipeline with three instantiations—multi-turn QG-QA, dialogue inpainting, and persona-based generation—alongside a concrete quality rubric covering relevance, coherence, informativeness, grounding, answerability, factual consistency, and specificity. Through automatic metrics and human evaluation across multiple textbook domains, the study finds that role-playing LLMs generally yield the strongest overall dialogue quality, albeit with hallucination and repetition challenges, while grounding-based methods excel in informativeness and groundedness. Pre-training educational chatbots on such textbook-derived synthetic data can improve downstream performance when domain alignment with the target task exists, indicating a practical pathway for cost-effective chatbot development. The work also discusses limitations and ethical considerations, emphasizing the need to balance data size and quality and to avoid overreliance on synthetic data in real classrooms.

Abstract

Educational chatbots are a promising tool for assisting student learning. However, the development of effective chatbots in education has been challenging, as high-quality data is seldom available in this domain. In this paper, we propose a framework for generating synthetic teacher-student interactions grounded in a set of textbooks. Our approaches capture one aspect of learning interactions where curious students with partial knowledge interactively ask a teacher questions about the material in the textbook. We highlight various quality criteria that such dialogues should fulfill and compare several approaches relying on either prompting or fine-tuning large language models. We use synthetic dialogues to train educational chatbots and show benefits of further fine-tuning in different educational domains. However, human evaluation shows that our best data synthesis method still suffers from hallucinations and tends to reiterate information from previous conversations. Our findings offer insights for future efforts in synthesizing conversational data that strikes a balance between size and quality. We will open-source our data and code.

Paper Structure (67 sections, 3 equations, 2 figures, 19 tables)

This paper contains 67 sections, 3 equations, 2 figures, 19 tables.

Figures (2)

  • Figure 1: Example of a synthetic teacher-student interaction based on a textbook, along with various criteria for evaluating the quality of the interaction. The criteria include Answer Relevance of the answer to the question, Coherence of the question-answer interaction with respect to the dialogue history, Informativeness of the overall interaction, Groundedness to the textbook, Answerability of the question from the textbook, Factual Consistency of the answer with respect to the question, and Specificity of the question. More details are given in the paper's section on quality criteria.
  • Figure 2: Book2Dial Framework for Generating Dialogues from Textbooks: Our approach uses two models, a Student model and a Teacher model. The Student model plays the role of a student, formulating questions from a limited context (document formatting). In contrast, the Teacher model assumes the role of a teacher, providing answers and guidance by referencing the (sub-)section in the textbook. This framework can be adapted to various instantiations of the two roles with varying formatting information, such as multi-turn QA-QG models (Kim et al., 2022), Dialogue Inpainting (Dai et al., 2022), and a prompting approach utilizing role-playing LLMs.
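
The two-role setup described in the Figure 2 caption can be illustrated with a minimal sketch. This is not the authors' implementation: `Section`, `generate_dialogue`, `toy_student`, and `toy_teacher` are hypothetical names, the two toy functions are simple stand-ins for real language models, and passing the dialogue history to both roles is an assumption made here for illustration. The only point it shows is the information asymmetry: the student sees limited formatting information (here just a section title), while the teacher sees the full section text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Turn = Tuple[str, str]  # one (question, answer) pair


@dataclass
class Section:
    """Hypothetical container for one textbook (sub-)section."""
    title: str  # formatting information visible to the student
    text: str   # full content, visible only to the teacher


# Assumed signatures: student(title, history) -> question,
# teacher(section_text, question, history) -> answer.
StudentFn = Callable[[str, List[Turn]], str]
TeacherFn = Callable[[str, str, List[Turn]], str]


def generate_dialogue(section: Section, student: StudentFn,
                      teacher: TeacherFn, num_turns: int = 3) -> List[Turn]:
    """Alternate student questions and teacher answers for num_turns turns."""
    history: List[Turn] = []
    for _ in range(num_turns):
        # The student only receives the limited context (the title).
        question = student(section.title, history)
        # The teacher answers with access to the full section text.
        answer = teacher(section.text, question, history)
        history.append((question, answer))
    return history


def toy_student(title: str, history: List[Turn]) -> str:
    """Stand-in for a student model: asks a templated question."""
    return f"Question {len(history) + 1}: what should I know about {title}?"


def toy_teacher(text: str, question: str, history: List[Turn]) -> str:
    """Stand-in for a teacher model: returns the next sentence of the section."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return sentences[len(history) % len(sentences)] + "."


section = Section(
    title="Photosynthesis",
    text="Plants convert light into chemical energy. Chlorophyll absorbs light.",
)
dialogue = generate_dialogue(section, toy_student, toy_teacher, num_turns=2)
for question, answer in dialogue:
    print("Student:", question)
    print("Teacher:", answer)
```

In a real instantiation, the two stand-in functions would be replaced by a question-generation model and an answering model (or by two prompted LLM roles), while the loop and the split between student-visible and teacher-visible context would stay the same.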