A Survey of Knowledge Tracing: Models, Variants, and Applications

Shuanghong Shen, Qi Liu, Zhenya Huang, Yonghe Zheng, Minghao Yin, Minjuan Wang, Enhong Chen

TL;DR

This survey provides a comprehensive taxonomy of Knowledge Tracing (KT), organizing fundamental KT models into Bayesian, logistic, and deep learning families and detailing representative methods such as BKT, DBKT, LFA, PFA, KTM, DKT, DKVMN, SAINT, AKT, and GKT. It highlights four classes of KT variants (modeling individualization before learning, engagement during learning, forgetting after learning, and side information across learning), showing how real-world learning contexts shape model enhancements. The authors also summarize typical KT applications in learning resource recommendation, adaptive learning, and broader domains, and release EduData and EduKTM to standardize datasets and baselines for reproducibility. The paper identifies key research opportunities in interpretability, data sparsity, subjective exercises, learner feedback, general user modeling, and integration with Large Language Models, underscoring KT's potential to improve personalized education at scale.

Abstract

Modern online education can provide intelligent educational services by automatically analyzing large amounts of student behavioral data. Knowledge Tracing (KT) is one of the fundamental tasks in this analysis: it aims to monitor students' evolving knowledge state during their problem-solving process. In recent years, a substantial number of studies have concentrated on this rapidly growing field and contributed significantly to its advancement. In this survey, we conduct a thorough investigation of this progress. First, we present three types of fundamental KT models with distinct technical routes. Next, we review extensive variants of the fundamental KT models that consider more stringent learning assumptions. Because the development of KT cannot be separated from its applications, we also present typical KT applications in various scenarios. To facilitate the work of researchers and practitioners in this field, we have developed two open-source algorithm libraries: EduData, which enables the download and preprocessing of KT-related datasets, and EduKTM, which provides an extensible and unified implementation of existing mainstream KT models. Finally, we discuss potential directions for future research. We hope that this survey will help both researchers and practitioners advance KT, thereby benefiting a broader range of students.

Paper Structure

This paper contains 45 sections, 20 equations, 7 figures, 2 tables.

Figures (7)

  • Figure 1: A simple schematic diagram of knowledge tracing. Different knowledge concepts are represented in different colors, while exercises are also depicted in the color relevant to the knowledge concepts. During the learning process, different kinds of side information are also recorded. The evolving process of the knowledge state is assessed by KT models and illustrated by the radar maps.
  • Figure 2: An overview of knowledge tracing models.
  • Figure 3: The topology of Bayesian Knowledge Tracing [corbett1994knowledge]. $K$ are the unobserved knowledge nodes, $A$ are the observed performance (answer) nodes, $P(L_0)$ is the initial probability, $P(T)$ is the transition probability, $P(G)$ is the guessing probability, and $P(S)$ is the slipping probability.
  • Figure 4: Example of activation of a knowledge tracing machine [vie2019knowledge]. $V$ refers to the matrix of embeddings, $w$ refers to the vector of biases, and $x$ is the encoding vector of the learning interaction.
  • Figure 5: The architecture of DKT [piech2015deep].
  • ...and 2 more figures
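
The Figure 3 caption names the four BKT parameters, $P(L_0)$, $P(T)$, $P(G)$, and $P(S)$. As a rough illustration of how they interact, the sketch below follows the standard two-state BKT update (a Bayes-rule posterior over the knowledge node after each answer, followed by a learning transition with no forgetting). It is not taken from the survey or from EduKTM; the function names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
def predict_correct(p_know, p_guess, p_slip):
    """Probability of a correct answer given the current P(known)."""
    return p_know * (1.0 - p_slip) + (1.0 - p_know) * p_guess


def bkt_posterior(p_know, correct, p_guess, p_slip):
    """P(known | observed answer), obtained with Bayes' rule."""
    if correct:
        num = p_know * (1.0 - p_slip)              # knew it and did not slip
        den = num + (1.0 - p_know) * p_guess       # or did not know and guessed
    else:
        num = p_know * p_slip                      # knew it but slipped
        den = num + (1.0 - p_know) * (1.0 - p_guess)
    return num / den


def bkt_step(p_know, correct, p_transit, p_guess, p_slip):
    """One BKT step: condition on the answer, then apply the learning transition P(T)."""
    post = bkt_posterior(p_know, correct, p_guess, p_slip)
    return post + (1.0 - post) * p_transit


def trace(answers, p_l0, p_transit, p_guess, p_slip):
    """Run BKT over a sequence of answers (True = correct) for one knowledge concept.

    Returns the final P(known) and the predicted P(correct) made before each answer.
    """
    p_know = p_l0
    predictions = []
    for correct in answers:
        predictions.append(predict_correct(p_know, p_guess, p_slip))
        p_know = bkt_step(p_know, correct, p_transit, p_guess, p_slip)
    return p_know, predictions


# Illustrative parameter values only (assumptions, not estimates from any dataset).
final_p, preds = trace([True, False, True, True],
                       p_l0=0.5, p_transit=0.1, p_guess=0.2, p_slip=0.1)
print(f"final P(known) = {final_p:.3f}")
print("predicted P(correct) per step:", [round(p, 3) for p in preds])
```

In this formulation a correct answer raises the estimated mastery and an incorrect one lowers it, while $P(G)$ and $P(S)$ control how much a single observation is trusted.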