ES-KT-24: A Multimodal Knowledge Tracing Benchmark Dataset with Educational Game Playing Video and Synthetic Text Generation

Dohee Kim; Unggi Lee; Sookbun Lee; Jiyeong Bae; Taekyung Ahn; Jaekwon Park; Gunho Lee; Hyeoncheol Kim

TL;DR

ES-KT-24 pairs educational game-playing videos with detailed game logs and synthetic question text, so that student learning patterns can be studied with knowledge tracing and other machine-learning techniques.

Abstract

This paper introduces ES-KT-24, a novel multimodal Knowledge Tracing (KT) dataset for intelligent tutoring systems in educational game contexts. Although KT is crucial in adaptive learning, existing datasets often lack game-based and multimodal elements. ES-KT-24 addresses these limitations by incorporating educational game-playing videos, synthetically generated question text, and detailed game logs. The dataset covers Mathematics, English, Indonesian, and Malaysian subjects, emphasizing diversity and including non-English content. The synthetic text component, generated using a large language model, encompasses 28 distinct knowledge concepts and 182 questions, and the dataset features 15,032 users and 7,782,928 interactions. Our benchmark experiments demonstrate the dataset's utility for KT research by comparing deep learning-based KT (DKT) models with Language Model-based Knowledge Tracing (LKT) approaches. Notably, LKT models showed slightly higher performance than traditional DKT models, highlighting the potential of language model-based approaches in this field. Beyond benchmarking, ES-KT-24 can support research in multimodal KT models and learning analytics: its combination of game-playing videos and detailed game logs allows student learning patterns to be examined with data analysis and machine-learning techniques.

Paper Structure (18 sections, 5 figures, 2 tables)

Introduction
Related Work
Knowledge Tracing Datasets
Knowledge Tracing Models
Data Description
License
Data Collection
Synthetic Data Generation
Data Cleaning and Processing
Statistics
The ES-KT-24 Benchmark
Experiment Setting
Baseline Models
Performance Analysis
Additional Data Usage for Educational Research
...and 3 more sections

Figures (5)

Figure 1: An example of ES-KT-24. ES-KT-24 consists of a multimodal dataset, including game-playing video, synthetic question text, knowledge concept (KC) text, and game logs collected from educational game contexts.
Figure 2: The process began with the manual game-play of educational games, which was screen-recorded. These recordings were then converted to text using OpenAI GPT-4o (OpenAI, 2024) for visual content and Whisper (Radford et al., 2023) for audio transcription. The resulting text was used to create question text corresponding to each game. Student problem-solving histories and game logs were preprocessed and explored through Exploratory Data Analysis (EDA), then transformed into sequence data suitable for KT tasks. Finally, this text and sequence data were released as a paired dataset.
Figure 3: KT dataset format. The left side shows the exercising process of a student who has already answered three questions and will answer question 4 next. The right side shows the corresponding question materials, which contain the question contents and KCs. Note that question and KC texts are used for LKT.
Figure 4: Data analysis results. Upper panels (a) to (f) are violin plots of sequence length, correct answer ratio, and learning time. Lower panels (g) and (h) show the sequence time of users with subject comparisons.
Figure 5: Flowchart for Correctness Determination Based on Game-play Interaction Data
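
The sequence format described in the Figure 3 caption (a student with three answered questions and a fourth to predict) can be illustrated with a minimal Python sketch. The class and field names here (`Interaction`, `question_id`, `kc_id`, `correct`) and the sample values are hypothetical and chosen only for illustration; they are not the dataset's actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interaction:
    # Hypothetical field names; the released dataset may use different ones.
    question_id: int
    kc_id: int
    correct: int  # 1 = correct, 0 = incorrect


def next_step_examples(
    seq: List[Interaction],
) -> List[Tuple[List[Interaction], int, int]]:
    """Turn one student's sequence into (history, target question, label) triples.

    For each position t >= 1, the history is the first t interactions and the
    target is the question answered at position t.
    """
    return [(seq[:t], seq[t].question_id, seq[t].correct) for t in range(1, len(seq))]


# Made-up sample: one student who answers four questions in order.
student = [
    Interaction(question_id=1, kc_id=10, correct=1),
    Interaction(question_id=2, kc_id=10, correct=0),
    Interaction(question_id=3, kc_id=11, correct=1),
    Interaction(question_id=4, kc_id=12, correct=1),
]

examples = next_step_examples(student)
history, target_q, label = examples[-1]
print(len(history), target_q, label)
```

The last triple corresponds to the situation in the caption: three past interactions form the history, and the model is asked to predict the response to question 4. An LKT-style model would additionally look up the question and KC text for each ID.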
