Developing a Tutoring Dialog Dataset to Optimize LLMs for Educational Use

Menna Fateen, Tsunenori Mine

TL;DR

This study developed a synthetic tutoring dialog dataset, evaluated by human teachers, and fine-tuned a smaller LLM using this dataset, demonstrating a viable, cost-effective approach for implementing LLM-based tutoring systems in educational settings.

Abstract

Recent advances in large language models (LLMs) have shown promise for scalable educational applications, but their use in dialog-based tutoring systems remains challenging due to the need for effective pedagogical strategies and the high costs associated with expert-curated datasets. Our study explores the use of smaller, more affordable LLMs for one-on-one tutoring in the context of solving reading comprehension problems. We developed a synthetic tutoring dialog dataset, evaluated by human teachers, and fine-tuned a smaller LLM using this dataset. Furthermore, we conducted an interactive experiment comparing the performance of the fine-tuned model with a larger model in real-world tutoring scenarios. Our results show that the fine-tuned model performs on par with the larger model but at a lower cost, demonstrating a viable, cost-effective approach for implementing LLM-based tutoring systems in educational settings.

Paper Structure

This paper contains 23 sections, 8 figures, and 6 tables.

Figures (8)

  • Figure 1: Dataset construction method.
  • Figure 2: Dialog tutoring generation process.
  • Figure 3: Talktime distribution for tutor and student agents.
  • Figure 4: Response value counts per dimension.
  • Figure 5: Correlation in tutor agent rating dimensions.
  • ...and 3 more figures