MAML-KT: Addressing Cold Start Problem in Knowledge Tracing for New Students via Few-Shot Model-Agnostic Meta Learning

Indronil Bhattacharjee, Christabel Wayllace

TL;DR

MAML-KT is a model-agnostic meta learning approach that learns an initialization optimized for rapid adaptation to new students using one or two gradient updates. It achieves higher early accuracy than prior KT models in nearly all cold start conditions, with gains persisting as cohort size increases.

Abstract

Knowledge tracing (KT) models are commonly evaluated by training on early interactions from all students and testing on later responses. While effective for measuring average predictive performance, this evaluation design obscures a cold start scenario that arises in deployment, where models must infer the knowledge state of previously unseen students from only a few initial interactions. Prior studies have shown that under this setting, standard empirically risk-minimized KT models such as DKT, DKVMN and SAKT exhibit substantially lower early accuracy than previously reported. We frame new-student performance prediction as a few-shot learning problem and introduce MAML-KT, a model-agnostic meta learning approach that learns an initialization optimized for rapid adaptation to new students using one or two gradient updates. We evaluate MAML-KT on ASSIST2009, ASSIST2015 and ASSIST2017 using a controlled cold start protocol that trains on a subset of students and tests on held-out learners across early interaction windows (questions 3-10 and 11-15), scaling cohort sizes from 10 to 50 students. Across datasets, MAML-KT achieves higher early accuracy than prior KT models in nearly all cold start conditions, with gains persisting as cohort size increases. On ASSIST2017, we observe a transient drop in early performance that coincides with many students encountering previously unseen skills. Further analysis suggests that these drops coincide with skill novelty rather than model instability, consistent with prior work on skill-level cold start. Overall, optimizing KT models for rapid adaptation reduces early prediction error for new students and provides a clearer lens for interpreting early accuracy fluctuations, distinguishing model limitations from genuine learning and knowledge acquisition dynamics.
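
To make the few-shot framing concrete, the following is a minimal sketch of the idea, not the authors' implementation. It assumes a tiny logistic model in place of DKT, DKVMN or SAKT, a first-order approximation of the meta-gradient rather than full second-order MAML, and synthetic students. All function names, features and hyperparameters (`make_student`, `adapt`, `meta_train`, the learning rates) are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """Probability of a correct response under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(theta, x)))

def grad(theta, data):
    """Mean binary cross-entropy gradient over (features, label) pairs."""
    g = [0.0] * len(theta)
    for x, y in data:
        err = predict(theta, x) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / len(data)
    return g

def adapt(theta, support, inner_lr=0.5, steps=2):
    """Inner loop: a few gradient steps on one student's first interactions."""
    for _ in range(steps):
        g = grad(theta, support)
        theta = [w - inner_lr * gj for w, gj in zip(theta, g)]
    return theta

def meta_train(students, n_support=3, outer_lr=0.1, epochs=200, dim=2):
    """Outer loop (first-order approximation, an assumption of this sketch):
    update the shared initialization using the loss of the adapted
    parameters on each student's later interactions."""
    theta = [0.0] * dim
    for _ in range(epochs):
        meta_g = [0.0] * dim
        for seq in students:
            support, query = seq[:n_support], seq[n_support:]
            g = grad(adapt(theta, support), query)
            meta_g = [m + gj / len(students) for m, gj in zip(meta_g, g)]
        theta = [w - outer_lr * m for w, m in zip(theta, meta_g)]
    return theta

def make_student(rng, n=15):
    """Hypothetical synthetic student: a hidden ability drives correctness."""
    ability = rng.uniform(-1.0, 2.0)
    seq = []
    for _ in range(n):
        difficulty = rng.uniform(-1.0, 1.0)
        p = sigmoid(ability - difficulty)
        seq.append(([1.0, difficulty], 1 if rng.random() < p else 0))
    return seq

rng = random.Random(0)
train_students = [make_student(rng) for _ in range(20)]  # training cohort
theta0 = meta_train(train_students)                      # learned initialization
new_student = make_student(rng)                          # held-out learner
adapted = adapt(theta0, new_student[:3])                 # adapt on 3 interactions
print("shared init:", theta0)
print("after adapting on 3 interactions:", adapted)
```

The split into a support set (the first few interactions) and a query set (later ones) mirrors the cold start protocol described above, where a model sees only a handful of responses from a held-out student before it must predict the next ones.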

Paper Structure

This paper contains 28 sections, 3 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Accuracy Scores vs. Number of Questions for 10 New Students across Models
  • Figure 2: Critical Cold Start (Questions 3-10): Average Accuracy across 5 Datasets $\times$ 4 Models $\times$ 3 Cohort Sizes
  • Figure 3: Moderate Cold Start (Questions 11-15): Average Accuracy across 5 Datasets $\times$ 4 Models $\times$ 3 Cohort Sizes
  • Figure 4: ASSIST2017, 20 New Students, Set 2: Explanation of Questions 6-8 and 10-12. (a) Model Accuracy vs. Questions. (b) Per-Student Answer Accuracy by Skill vs. Questions. Each line represents a specific skill; new skills introduced in Q6-8 and Q10-12 are marked with red circles.