From Words to Wisdom: Discourse Annotation and Baseline Models for Student Dialogue Understanding

Farjana Sultana Mim, Shuchin Aeron, Eric Miller, Kristen Wendell

TL;DR

This paper tackles automatic detection of Knowledge Construction (KC) versus Task Production (TP) discourse in student dialogues and introduces a new annotated educational dataset. It establishes baseline KCTP prediction models using GPT-3.5 and the open-source Llama-3.1, evaluated under zero-shot prompting, few-shot prompting, and instruction fine-tuning. Results show that current large language models perform suboptimally on this specialized educational discourse task, with a best $F_1$ of around $0.57$, underscoring the need for improved methods. The work provides a scalable framework to study how curricular and pedagogical factors cue knowledge construction in collaborative learning.

Abstract

Identifying discourse features in student conversations helps educational researchers recognize the curricular and pedagogical variables that cause students to engage in constructing knowledge rather than merely completing tasks. Manual analysis of student conversations to identify these discourse features is time-consuming and labor-intensive, which limits the scale and scope of studies. Natural language processing (NLP) techniques can automate the detection of these discourse features, offering educational researchers scalable and data-driven insights. However, existing NLP studies of discourse in dialogue rarely address educational data. In this work, we address this gap by introducing an annotated educational dialogue dataset of student conversations featuring knowledge construction and task production discourse. We also establish baseline models that automatically predict these discourse properties for each turn of talk within a conversation, using the pre-trained large language models GPT-3.5 and Llama-3.1. Experimental results show that these state-of-the-art models perform suboptimally on this task, leaving room for future research.
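
As a rough illustration of how per-turn predictions of this kind are commonly scored, the sketch below computes per-class and macro-averaged $F_1$ from gold and predicted turn labels. It is an assumption-laden example, not the authors' evaluation code: the label strings `"KC"` and `"TP"`, the toy data, the function names, and the choice of macro averaging are all hypothetical, since this page does not state which label set or averaging scheme the paper uses.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have one label per turn")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong for this turn
            fn[g] += 1  # gold label g was missed for this turn
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


# Toy example: one label per turn of talk (hypothetical data).
gold = ["KC", "TP", "TP", "KC", "TP"]
pred = ["KC", "TP", "KC", "TP", "TP"]
print(per_class_f1(gold, pred))
print(round(macro_f1(gold, pred), 3))
```

Macro averaging weights each class equally regardless of how many turns carry it, which matters when one discourse category dominates the dataset; a weighted or micro average would give a different number for the same predictions.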

Paper Structure

This paper contains 24 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Snippet of a students' homework discussion showing knowledge construction and task production discourse.
  • Figure 2: Topic distribution across the dataset
  • Figure 3: Distribution of categories across the dataset.
  • Figure 4: Confusion matrix of dual annotations
  • Figure 5: Confusion matrix of model prediction vs. true annotated labels