BIPED: Pedagogically Informed Tutoring System for ESL Education

Soonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim

TL;DR

The paper addresses the scarcity of pedagogically deep, bilingual Conversational Intelligent Tutoring Systems (CITS) for ESL by introducing BIPED, a bilingual tutoring dataset annotated with 34 tutor acts and 9 student acts. It adopts a two-step framework that first predicts a tutor act and then generates the corresponding utterance, implemented via GPT-4 prompting and SOLAR-KO instruction-tuning, to produce tutor utterances that resemble human teaching and draw on diverse strategies. Empirical results show that the fine-tuned model achieves strong utterance quality and broad act diversity, indicating the approach’s potential to deliver pedagogically informed ESL tutoring. The work advances ESL CITS by making tutoring behavior controllable and interpretable, and it outlines limitations and future directions, including interactive evaluation and scaling to larger models.

Abstract

Large Language Models (LLMs) have great potential to serve as readily available and cost-efficient Conversational Intelligent Tutoring Systems (CITS) for teaching L2 learners of English. Existing CITS, however, are designed to teach only simple concepts or lack the pedagogical depth necessary to address diverse learning strategies. To develop a more pedagogically informed CITS capable of teaching complex concepts, we construct a BIlingual PEDagogically-informed Tutoring Dataset (BIPED) of one-on-one, human-to-human English tutoring interactions. Through post-hoc analysis of the tutoring interactions, we derive a lexicon of dialogue acts (34 tutor acts and 9 student acts), which we use to further annotate the collected dataset. Based on a two-step framework of first predicting the appropriate tutor act and then generating the corresponding response, we implement two CITS models using GPT-4 and SOLAR-KO, respectively. We experimentally demonstrate that the implemented models not only replicate the style of human teachers but also employ diverse and contextually appropriate pedagogical strategies.
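
The two-step framework described above (predict a tutor act, then generate an utterance conditioned on it) can be illustrated with a minimal sketch. This is not the authors' implementation: the act labels, the rule-based predictor, and the template generator below are hypothetical stand-ins for the 34-act lexicon and the GPT-4 / SOLAR-KO models, chosen only so that the control flow is runnable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Turn:
    speaker: str  # "tutor" or "student"
    text: str


# Hypothetical act labels for illustration only; NOT the BIPED lexicon.
ILLUSTRATIVE_TUTOR_ACTS = ("ask_question", "give_hint", "give_feedback")


def predict_tutor_act(history: List[Turn]) -> str:
    """Step 1 (stub): choose a tutor act from the dialogue history.

    A real system would query a language model here; this rule-based
    stand-in only makes the two-step control flow concrete.
    """
    if not history or history[-1].speaker != "student":
        return "ask_question"
    if history[-1].text.strip().endswith("?"):
        return "give_hint"
    return "give_feedback"


def generate_utterance(history: List[Turn], act: str) -> str:
    """Step 2 (stub): produce an utterance conditioned on the chosen act."""
    templates = {
        "ask_question": "Can you try using this word in a sentence?",
        "give_hint": "Think about how the word is used in the passage.",
        "give_feedback": "Thanks for your answer. Let's check it together.",
    }
    return templates[act]


def tutor_turn(
    history: List[Turn],
    predict: Callable[[List[Turn]], str] = predict_tutor_act,
    generate: Callable[[List[Turn], str], str] = generate_utterance,
) -> Tuple[str, Turn]:
    """Run act prediction, then act-conditioned generation."""
    act = predict(history)
    return act, Turn("tutor", generate(history, act))


if __name__ == "__main__":
    dialogue = [
        Turn("tutor", "What is the past tense of 'go'?"),
        Turn("student", "goed"),
    ]
    act, reply = tutor_turn(dialogue)
    print(act, "->", reply.text)
```

Keeping the act as an explicit intermediate value is what makes the tutor's behavior inspectable: the chosen act can be logged, compared against human annotations, or overridden before generation.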

Paper Structure (31 sections, 3 figures, 3 tables)

Figures (3)

  • Figure 1: Example of our dataset, BIPED. It includes a series of dialogues between a tutor and a student, annotated with dialogue acts, content information, and the correctness of student responses.
  • Figure 2: Distribution of tutor acts in BIPED.
  • Figure 3: The comparative distributions of chosen tutor acts. The grey bars represent the distribution of tutor acts in the test set while the blue bars denote the distribution of chosen tutor acts by Base GPT, our fine-tuned model, and GPT-4 (1-shot).
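
A comparison like the one in Figure 3 rests on normalizing act counts into distributions. The sketch below shows one way to do that, with made-up act labels and counts. The total variation distance is an assumption added here as a convenient single-number summary; the source does not state which measure, if any, the authors use.

```python
from collections import Counter
from typing import Dict, List


def act_distribution(acts: List[str]) -> Dict[str, float]:
    """Normalize a list of act labels into relative frequencies."""
    counts = Counter(acts)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {act: count / total for act, count in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two act distributions.

    Ranges from 0 (identical) to 1 (disjoint support). Assumed here as
    an illustrative measure, not taken from the paper.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    # Hypothetical labels and counts, for illustration only.
    reference_acts = ["hint", "hint", "question", "feedback"]
    model_acts = ["hint", "hint", "hint", "hint"]
    p = act_distribution(reference_acts)
    q = act_distribution(model_acts)
    print(total_variation(p, q))
```

A model that collapses onto a few acts yields a larger distance from the reference distribution than one that spreads its choices the way human tutors do.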