Enhancing Deep Knowledge Tracing via Diffusion Models for Personalized Adaptive Learning
Ming Kuo, Shouvon Sarker, Lijun Qian, Yujian Fu, Xiangfang Li, Xishuang Dong
TL;DR
The paper addresses data scarcity in deep knowledge tracing for personalized adaptive learning (PAL) by using TabDDPM, a diffusion model tailored for tabular data, to generate synthetic educational records. These synthetic records augment the training data for Deep Knowledge Tracing (DKT), yielding improved performance on the ASSISTments2009 dataset, particularly when real training data are scarce. The experiments show that adding more AI-generated data can raise accuracy, AUC, precision, and recall while stabilizing performance. This diffusion-based data augmentation offers a practical path to enhancing PAL systems and motivates extending synthetic data generation to broader knowledge tracing (KT) tasks and personalized learning path recommendations.
Abstract
In contrast to pedagogies like evidence-based teaching, personalized adaptive learning (PAL) distinguishes itself by closely monitoring the progress of individual students and tailoring the learning path to their unique knowledge and requirements. A crucial technique for effective PAL implementation is knowledge tracing, which models students' evolving knowledge to predict their future performance. Based on these predictions, personalized recommendations for resources and learning paths can be made to meet individual needs. Recent advancements in deep learning have successfully enhanced knowledge tracing through Deep Knowledge Tracing (DKT). This paper introduces generative AI models to further enhance DKT. Generative AI models, rooted in deep learning, are trained to generate synthetic data, addressing data scarcity challenges in applications across fields such as natural language processing (NLP) and computer vision (CV). This study aims to tackle the shortage of student learning records in order to improve DKT performance for PAL. Specifically, it employs TabDDPM, a diffusion model, to generate synthetic educational records that augment the training data for DKT. The proposed method's effectiveness is validated through extensive experiments on ASSISTments datasets. The experimental results demonstrate that the data generated by TabDDPM significantly improve DKT performance, particularly in scenarios with little training data and a large test set.
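The augmentation step described above can be illustrated with a minimal sketch. It assumes that synthetic records have already been sampled from a trained generative model and only shows how they might be mixed into a small real training set. The record schema `(student_id, skill_id, correct)`, the function name `augment_training_set`, and the `ratio` parameter are hypothetical and are not taken from the paper or from the TabDDPM codebase.

```python
import random
from typing import List, Tuple

# Hypothetical schema for one interaction: (student_id, skill_id, correct).
Record = Tuple[int, int, int]


def augment_training_set(
    real: List[Record],
    synthetic: List[Record],
    ratio: float,
    seed: int = 0,
) -> List[Record]:
    """Return the real records followed by a random sample of synthetic ones.

    `ratio` is the number of synthetic records to add per real record
    (an assumed knob for illustration). The sample is capped at the
    number of synthetic records available.
    """
    if ratio < 0:
        raise ValueError("ratio must be non-negative")
    n_extra = min(len(synthetic), int(round(ratio * len(real))))
    rng = random.Random(seed)  # fixed seed keeps the mix reproducible
    extra = rng.sample(synthetic, n_extra)  # sampling without replacement
    return list(real) + extra


# Usage: a small real set augmented with half as many synthetic records.
real_records = [(1, 10, 1), (1, 11, 0), (2, 10, 1), (2, 12, 1)]
synthetic_records = [(100 + i, 10 + i % 3, i % 2) for i in range(10)]
train_records = augment_training_set(real_records, synthetic_records, ratio=0.5)
```

The test split would be left untouched, so that evaluation reflects real student records only; varying `ratio` corresponds to adding increasing amounts of generated data.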
