Faster, Cheaper, More Accurate: Specialised Knowledge Tracing Models Outperform LLMs

Prarthana Bhattacharyya, Joshua Mitton, Ralph Abboud, Simon Woodhead

TL;DR

Knowledge tracing (KT) models outperform LLMs in accuracy and F1 score on the domain-specific task of predicting student responses, which shows that current closed-source LLMs should not be treated as a universal solution for all tasks.

Abstract

Predicting future student responses to questions is particularly valuable for educational learning platforms, where it enables effective interventions. A key approach to this task is the use of knowledge tracing (KT) models: small, domain-specific, temporal models trained on student question-response data. KT models are optimised for high accuracy on specific educational domains and offer fast inference and scalable deployment. The rise of Large Language Models (LLMs) motivates the following questions: (1) How well can LLMs predict students' future responses to questions? (2) Are LLMs scalable for this domain? (3) How do LLMs compare to KT models on this domain-specific task? In this paper, we compare multiple LLMs and KT models across predictive performance, deployment cost, and inference speed to answer these questions. We show that KT models outperform LLMs in accuracy and F1 score on this domain-specific task. Further, we demonstrate that LLMs are orders of magnitude slower than KT models and cost orders of magnitude more to deploy. This highlights the importance of domain-specific models for education prediction tasks and shows that current closed-source LLMs should not be used as a universal solution for all tasks.
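
The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes that each prediction is a binary label (1 for a correct student response, 0 for an incorrect one) and that F1 is computed on the positive class; the paper may use a different averaging scheme. The function name `accuracy_and_f1` and the sample labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, treating 1 as the positive class.

    Assumption: 1 = student answered correctly, 0 = answered incorrectly.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one prediction is required")

    # Count the four outcomes of the binary confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)

    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all.
    denominator = 2 * tp + fp + fn
    f1 = (2 * tp / denominator) if denominator else 0.0
    return accuracy, f1


# Hypothetical example: five observed responses and five model predictions.
acc, f1 = accuracy_and_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

Reporting both metrics is useful when correct and incorrect responses are imbalanced, since accuracy alone can look high for a model that always predicts the majority class.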

Paper Structure

This paper contains 18 sections, 2 equations, 6 figures, and 3 tables.

Figures (6)

  • Figure 1: Comparing specialised Knowledge Tracing (KT) models with Large Language Models (LLMs) for students' future performance predictions.
  • Figure 2: LLM prompt template for student response prediction.
  • Figure 3: Comparison of model performance.
  • Figure 4: Latency and model size comparison for different models.
  • Figure 5: Annual inference cost comparison for 100,000 students receiving 40 predictions per year.
  • ...and 1 more figures
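
The deployment scenario in the Figure 5 caption (100,000 students receiving 40 predictions per year) implies a fixed annual prediction volume, which the sketch below works out. The per-prediction cost is a hypothetical placeholder, not a figure from the paper; only the student count and the predictions per student come from the caption.

```python
# Values taken from the Figure 5 caption.
STUDENTS = 100_000
PREDICTIONS_PER_STUDENT_PER_YEAR = 40


def annual_predictions(students: int, predictions_per_student: int) -> int:
    """Total number of predictions served in one year."""
    return students * predictions_per_student


def annual_inference_cost(
    students: int, predictions_per_student: int, cost_per_prediction: float
) -> float:
    """Annual cost, assuming a flat cost per prediction (a simplifying assumption)."""
    return annual_predictions(students, predictions_per_student) * cost_per_prediction


volume = annual_predictions(STUDENTS, PREDICTIONS_PER_STUDENT_PER_YEAR)

# HYPOTHETICAL per-prediction cost, for illustration only; not reported in the source.
example_cost = annual_inference_cost(
    STUDENTS, PREDICTIONS_PER_STUDENT_PER_YEAR, cost_per_prediction=0.001
)
```

At this volume of 4,000,000 predictions per year, any difference in per-prediction cost between two models is multiplied by four million, which is why per-call cost dominates the annual comparison.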