Live Knowledge Tracing: Real-Time Adaptation using Tabular Foundation Models

Mounir Lbath, Alexandre Paresy, Abdelkayoum Kaddouri, Alan André, Alexandre Ittah, Jill-Jênn Vie

TL;DR

The paper proposes Live Knowledge Tracing (liveKT), which uses Tabular Foundation Models (TFMs) to perform training-free, online inference on student interactions. Central to the approach is a two-way attention mechanism that operates across time steps and across training rows. It aligns a test sequence with relevant training sequences and predicts $p(c_T \mid q_{1:T}, s_{1:T}, c_{1:T-1}, \mathcal{D}_{train})$ without offline retraining. Empirical results across multiple datasets show competitive AUC with substantial wall-clock speedups (up to $273\times$) over fully trained models in a setting where more interactions are observed over time, which supports practical viability in live educational settings. The work also discusses interpretability opportunities via attention patterns and outlines future directions, including domain-specific pretraining of TFMs and deeper analyses of model behavior for instructional decision-making and fairness.

Abstract

Deep knowledge tracing models have achieved significant breakthroughs in modeling student learning trajectories. However, these architectures require substantial training time and are prone to overfitting on datasets with short sequences. In this paper, we explore a new paradigm for knowledge tracing by leveraging tabular foundation models (TFMs). Unlike traditional methods that require offline training on a fixed training set, our approach performs real-time "live" knowledge tracing online. The core of our method lies in a two-way attention mechanism: while attention-based knowledge tracing models only attend across earlier time steps, TFMs simultaneously attend across both time steps and the interactions of other students in the training set. They align testing sequences with relevant training sequences at inference time, thereby skipping the training step entirely. We demonstrate, using several datasets of increasing size, that our method achieves competitive predictive performance with speedups of up to 273x, in a setting where more student interactions are observed over time.
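
The two-way attention idea can be illustrated with a minimal NumPy sketch. This is not the authors' implementation or the API of any TFM library: the function names (`attend`, `two_way_attention`), the tensor layout `(n_rows, T, d)`, and the `n_train` parameter are assumptions made for illustration, and learned projections, multiple heads, and label embeddings are omitted. It only shows the two axes of attention: a causal pass across earlier time steps within each sequence, followed by a pass across rows in which every sequence attends to the training sequences at the same time step.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax; entries masked with -inf get weight 0."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)


def attend(queries, keys, values, mask=None):
    """Plain scaled dot-product attention (no learned projections)."""
    d = queries.shape[-1]
    scores = queries @ np.swapaxes(keys, -1, -2) / np.sqrt(d)
    if mask is not None:
        # Positions where mask is False are excluded from the softmax.
        scores = np.where(mask, scores, -np.inf)
    return softmax(scores) @ values


def two_way_attention(x, n_train):
    """Hypothetical two-way attention over a batch of sequences.

    x has shape (n_rows, T, d); the first n_train rows are assumed to be
    training sequences and the remaining rows are test sequences.
    """
    n_rows, T, d = x.shape

    # Pass 1: attention across time steps, restricted to earlier steps
    # (and the current one) within each row.
    causal = np.tril(np.ones((T, T), dtype=bool))
    h = attend(x, x, x, mask=causal)              # (n_rows, T, d)

    # Pass 2: attention across rows at each time step. Every row queries
    # the training rows only, so test rows never influence each other.
    h_t = np.swapaxes(h, 0, 1)                    # (T, n_rows, d)
    train = h_t[:, :n_train, :]                   # (T, n_train, d)
    out = attend(h_t, train, train)               # (T, n_rows, d)
    return np.swapaxes(out, 0, 1)                 # (n_rows, T, d)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4, 3))   # 3 training rows + 2 test rows, 4 time steps
out = two_way_attention(x, n_train=3)
print(out.shape)
```

In this sketch no parameters are fitted: the training sequences enter only as keys and values at inference time, which is the sense in which such a model can skip a separate training step when new interactions arrive.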

Paper Structure

This paper contains 9 sections, 1 equation, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Results for the Assistments 2009 and POJ datasets. Top: predictive performance (AUC). Bottom: total time spent on (optional) training and testing.