Mastery Guided Non-parametric Clustering to Scale-up Strategy Prediction

Anup Shakya, Vasile Rus, Deepak Venugopal

TL;DR

The paper tackles the challenge of scaling strategy prediction in Adaptive Instructional Systems by exploiting symmetry in student mastery to cluster data non-parametrically. It introduces a mastery-based embedding, MVec, built on a Node2Vec-style relational graph of students, problems, and knowledge components, and uses a coarse-to-fine DP-Means clustering with a symmetry penalty to form strategy-invariant partitions. A one-to-many LSTM is trained on samples drawn from converged clusters to predict KC sequences, while an attention-based model estimates KC mastery to drive the embeddings. Experiments on large MATHia datasets BA08 and CL19 show that Attention Sampling achieves high accuracy with a small fraction of training data and maintains fairness across mastery levels, demonstrating scalable and robust strategy prediction for AIS deployment.

Abstract

Predicting the strategy (sequence of concepts) that a student is likely to use in problem-solving helps Adaptive Instructional Systems (AISs) better adapt themselves to different types of learners based on their learning abilities. This can lead to a more dynamic, engaging, and personalized experience for students. To scale up training a prediction model (such as LSTMs) over large-scale education datasets, we develop a non-parametric approach to cluster symmetric instances in the data. Specifically, we learn a representation based on Node2Vec that encodes symmetries over mastery or skill level since, to solve a problem, it is natural that a student's strategy is likely to involve concepts in which they have gained mastery. Using this representation, we use DP-Means to group symmetric instances through a coarse-to-fine refinement of the clusters. We apply our model to learn strategies for Math learning from large-scale datasets from MATHia, a leading AIS for middle-school math learning. Our results illustrate that our approach can consistently achieve high accuracy using a small sample that is representative of the full dataset. Further, we show that this approach helps us learn strategies with high accuracy for students at different skill levels, i.e., leveraging symmetries improves fairness in the prediction model.
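The clustering step mentioned in the abstract builds on DP-Means, which behaves like k-means except that a point lying farther than a threshold from every existing centroid opens a new cluster, so the number of clusters is not fixed in advance. Below is a minimal sketch of plain DP-Means only, assuming squared Euclidean distance on generic embedding vectors. The function name `dp_means` and the parameters `lam` and `max_iter` are illustrative, and the paper's symmetry penalty and coarse-to-fine refinement are not modeled here.

```python
import numpy as np

def dp_means(X, lam, max_iter=100):
    """Plain DP-Means sketch (not the paper's symmetry-penalized variant).

    A point whose squared Euclidean distance to every centroid exceeds
    `lam` starts a new cluster; otherwise it joins the nearest one.
    """
    X = np.asarray(X, dtype=float)
    centroids = [X.mean(axis=0)]            # start with one global cluster
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(X):
            d2 = [float(np.sum((x - c) ** 2)) for c in centroids]
            k = int(np.argmin(d2))
            if d2[k] > lam:                 # too far from all clusters
                centroids.append(x.copy())
                k = len(centroids) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True
        # Drop empty clusters, relabel contiguously, recompute the means.
        used = sorted(set(labels.tolist()))
        remap = {old: new for new, old in enumerate(used)}
        labels = np.array([remap[l] for l in labels.tolist()])
        centroids = [X[labels == j].mean(axis=0) for j in range(len(used))]
        if not changed:
            break
    return labels, np.array(centroids)

# Toy data: two well-separated groups of hypothetical embedding vectors.
X_demo = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
labels_demo, centroids_demo = dp_means(X_demo, lam=1.0)
```

A smaller `lam` yields more, finer clusters and a larger `lam` yields fewer, coarser ones, which is one way to picture a coarse-to-fine schedule. How the paper actually sets and refines this threshold is not specified in this summary.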

Paper Structure

This paper contains 12 sections, 8 equations, 4 figures, 2 tables, 2 algorithms.

Figures (4)

  • Figure 1: (a) Example of a strategy used by a student to solve a math problem. (b) Different strategies to solve linear equations based on increasing levels of mastery.
  • Figure 2: Illustrating a graph network of three students, problems, and KCs. The figure on the right shows some of the sampled random walks/paths.
  • Figure 3: An example to illustrate the use of attention for mastery estimation. For each KC, the bar charts show the attention on that KC across steps that the student solves successfully (CFA=1), normalized by the total attention for that KC. Larger values indicate that the model believes the student understands the KC, since the attention on it is large when CFA=1, and vice versa.
  • Figure 4: Illustrating scalability vs. accuracy. (a) and (b) show test accuracy for strategy prediction under varying training data size limits. (c) and (d) show strategy prediction accuracy under different training time limits. (e) shows accuracy for different groups of students (based on their performance). The x-axis denotes ranges of percentages, where a range $a-b$ denotes that students in this group got more than $a$% and fewer than $b$% of steps correct on their first attempt. The y-axis shows accuracy over the groups.
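
The normalization described in the Figure 3 caption can be sketched as a ratio: for each KC, the attention accumulated on correctly solved steps (CFA=1) divided by the total attention on that KC. The snippet below is a minimal illustration under assumed inputs: a steps-by-KCs attention matrix and a 0/1 CFA vector. The function name `attention_mastery` and the `eps` guard are hypothetical, and the paper's actual attention model is not reproduced here.

```python
import numpy as np

def attention_mastery(attention, cfa, eps=1e-12):
    """Per-KC ratio of attention on correct steps to total attention.

    attention: array of shape (num_steps, num_kcs), assumed non-negative.
    cfa:       array of shape (num_steps,), 1 if the step was correct on
               the first attempt, else 0.
    """
    attention = np.asarray(attention, dtype=float)
    cfa = np.asarray(cfa, dtype=float)
    correct = (attention * cfa[:, None]).sum(axis=0)  # attention where CFA=1
    total = attention.sum(axis=0)                     # attention over all steps
    return correct / np.maximum(total, eps)           # eps avoids division by zero

# Hypothetical example: two steps, two KCs; only the first step is correct.
mastery_demo = attention_mastery([[0.8, 0.2], [0.2, 0.8]], [1, 0])
```

In this toy input the first KC receives most of its attention on the correct step, so its ratio is high, while the second KC receives most of its attention on the incorrect step, so its ratio is low.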

Theorems & Definitions (2)

  • Definition 1
  • Definition 2