KARL: Knowledge-Aware Retrieval and Representations aid Retention and Learning in Students

Matthew Shu, Nishant Balepur, Shi Feng, Jordan Boyd-Graber

TL;DR

KARL is a simple but effective content-aware student model that combines deep knowledge tracing, retrieval, and BERT to predict student recall. It improves learning efficiency over SOTA, which encourages researchers to look beyond historical study data to fully capture student abilities.

Abstract

Flashcard schedulers rely on 1) student models to predict the flashcards a student knows; and 2) teaching policies to pick which cards to show next via these predictions. Prior student models, however, just use study data like the student's past responses, ignoring the text on cards. We propose content-aware scheduling, the first schedulers exploiting flashcard content. To give the first evidence that such schedulers enhance student learning, we build KARL, a simple but effective content-aware student model employing deep knowledge tracing (DKT), retrieval, and BERT to predict student recall. We train KARL by collecting a new dataset of 123,143 study logs on diverse trivia questions. KARL bests existing student models in AUC and calibration error. To ensure our improved predictions lead to better student learning, we create a novel delta-based teaching policy to deploy KARL online. Based on 32 study paths from 27 users, KARL improves learning efficiency over SOTA, showing KARL's strength and encouraging researchers to look beyond historical study data to fully capture student abilities.

Paper Structure (38 sections, 3 equations, 8 figures, 11 tables)

Figures (8)

  • Figure 1: Overview of $\text{KAR}^\text{3}\text{L}$. Given a current flashcard and the student's study history as inputs, $\text{KAR}^\text{3}\text{L}$ first uses a BERT retriever to obtain the most semantically similar cards from the study history. Next, the BERT embeddings of these retrieved flashcards, the embedding of the current flashcard, and flashcard-level features (e.g., time since last review) are fed through a classifier (CLF) to predict whether the student knows the answer to the current flashcard.
  • Figure 2: Screenshot from our web-based flashcard app after a user submits their answer to a literature flashcard.
  • Figure 3: Forgetting curve for US history cards. When Card 1 is studied, $\text{KAR}^\text{3}\text{L}$'s predicted recall for the semantically related Card 2 increases, even though Card 2 was not studied.
  • Figure 4: Top-3 retrieved vs past-3 studied flashcards when the user studies a new card on Japanese literature.
  • Figure 5: The average forgetting curve ten days before and after a study (day zero), shown both for when the user succeeds and for when the user fails at recalling the flashcard. Unlike in exponential forgetting models, the convexity of our forgetting curve depends on both the current predicted recall and the outcome of the most recent study, which adds flexibility.
  • ...and 3 more figures
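
The pipeline described in the Figure 1 caption (retrieve similar cards, concatenate embeddings and card-level features, classify) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the short hand-written vectors stand in for BERT embeddings, cosine similarity is an assumed similarity measure, a plain logistic function stands in for the classifier (CLF), and all function names, card ids, and weights are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_top_k(current_vec, history, k=3):
    """Return the k studied cards most similar to the current card."""
    ranked = sorted(history,
                    key=lambda item: cosine(current_vec, item["vec"]),
                    reverse=True)
    return ranked[:k]

def build_features(current_vec, retrieved, card_features):
    """Concatenate current embedding, retrieved embeddings, and card-level features."""
    feats = list(current_vec)
    for item in retrieved:
        feats.extend(item["vec"])
    feats.extend(card_features)
    return feats

def predict_recall(features, weights, bias=0.0):
    """Logistic stand-in for the classifier: returns a recall probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Toy study history; the 2-d vectors are placeholders for BERT embeddings.
history = [
    {"id": "card_a", "vec": [1.0, 0.0]},
    {"id": "card_b", "vec": [0.0, 1.0]},
    {"id": "card_c", "vec": [0.9, 0.1]},
]
current = [1.0, 0.1]

top = retrieve_top_k(current, history, k=2)
# Hypothetical card-level feature: days since last review.
features = build_features(current, top, card_features=[2.5])
# Untrained (all-zero) weights, so the prediction is uninformative.
probability = predict_recall(features, weights=[0.0] * len(features))
```

The sketch retrieves by similarity rather than recency, which mirrors the contrast drawn in Figure 4 between the top-3 retrieved and the past-3 studied flashcards. A real system would also need to handle students with fewer than `k` studied cards, for example by padding the feature vector; that detail is omitted here.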