Predicting Learning Performance with Large Language Models: A Study in Adult Literacy

Liang Zhang; Jionghao Lin; Conrad Borchers; John Sabatini; John Hollander; Meng Cao; Xiangen Hu

TL;DR

The paper addresses predicting learning performance in adult literacy within Intelligent Tutoring Systems (ITSs) by pairing GPT-4 with traditional machine learning, specifically through an encoding/decoding framework that prompts GPT-4 to select and tune models such as XGBoost. On CSAL AutoTutor data, evaluated with 5-fold cross-validation, the GPT-4-enabled approach achieves competitive, often superior, predictive accuracy compared with BKT, PFA, SPARFA-Lite, and Tensor Factorization. The GPT-4-selected XGBoost configuration delivers the best results, with further improvements when run on the GPT-4 platform. Hyperparameter tuning by GPT-4 performs comparably to manual grid search but tends to exhibit greater variability, indicating a trade-off between automation and stability. The study demonstrates the potential of integrating LLMs with established ML techniques to enhance personalization in adult literacy education and lays groundwork for future LLM-driven learner modeling and knowledge tracing in ITSs.

Abstract

Intelligent Tutoring Systems (ITSs) have significantly enhanced adult literacy training, a key factor for societal participation, employment opportunities, and lifelong learning. Our study investigates the application of advanced AI models, including Large Language Models (LLMs) such as GPT-4, for predicting learning performance in adult literacy programs in ITSs. This research is motivated by the potential of LLMs to predict learning performance based on their inherent reasoning and computational capabilities. Using reading comprehension datasets from the ITS AutoTutor, we evaluate the predictive capabilities of GPT-4 versus traditional machine learning methods through five-fold cross-validation. Our findings show that GPT-4 achieves predictive performance competitive with traditional machine learning methods such as Bayesian Knowledge Tracing, Performance Factor Analysis, Sparse Factor Analysis Lite (SPARFA-Lite), tensor factorization, and eXtreme Gradient Boosting (XGBoost). While XGBoost (trained on a local machine) outperforms GPT-4 in predictive accuracy, GPT-4-selected XGBoost and its subsequent tuning on the GPT-4 platform demonstrate superior performance compared to local machine execution. Moreover, our investigation into hyper-parameter tuning by GPT-4 versus grid search, using XGBoost as the case study, suggests comparable performance, albeit with less stability in the automated approach. Our study contributes to the field by highlighting the potential of integrating LLMs with traditional machine learning models to enhance predictive accuracy and personalize adult literacy education, setting a foundation for future research in applying LLMs within ITSs.
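
The five-fold cross-validation protocol mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the synthetic `(learner, question, correct)` records, the per-learner mean baseline, the RMSE metric, and the helper names `k_fold_indices`, `fit_mean_baseline`, and `cross_validate` are hypothetical and do not come from the paper, whose models (BKT, PFA, SPARFA-Lite, tensor factorization, XGBoost, GPT-4) are not reproduced.

```python
import math
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle indices 0..n_items-1 and split them into k near-equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_mean_baseline(records):
    """Hypothetical baseline: per-learner mean correctness, global mean as fallback."""
    totals = {}
    for learner, _question, correct in records:
        hits, count = totals.get(learner, (0, 0))
        totals[learner] = (hits + correct, count + 1)
    global_mean = sum(c for _, _, c in records) / len(records)
    learner_means = {name: h / n for name, (h, n) in totals.items()}
    return lambda learner: learner_means.get(learner, global_mean)


def cross_validate(records, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, and return one RMSE per fold."""
    scores = []
    for held_out in k_fold_indices(len(records), k, seed):
        held = set(held_out)
        train = [r for i, r in enumerate(records) if i not in held]
        test = [records[i] for i in held_out]
        predict = fit_mean_baseline(train)
        squared_error = sum((predict(l) - c) ** 2 for l, _, c in test)
        scores.append(math.sqrt(squared_error / len(test)))
    return scores


# Synthetic records, generated only for illustration (not AutoTutor data).
rng = random.Random(42)
records = [(f"learner_{i % 10}", f"q_{i % 7}", rng.randint(0, 1)) for i in range(200)]
fold_rmse = cross_validate(records)
print(len(fold_rmse), sum(fold_rmse) / len(fold_rmse))
```

Swapping `fit_mean_baseline` for any other model that maps a training set to a prediction function would leave the evaluation loop unchanged, which is the property that lets several methods be compared on identical folds.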

Paper Structure (22 sections, 2 figures, 3 tables)

This paper contains 22 sections, 2 figures, 3 tables.

Introduction
Related Work
Adult Literacy Education in Intelligent Tutoring Systems
Learning Performance Prediction
Large Language Models in Education
Methods
Dataset
The Proposed LLM-based Prediction Method
Baseline Methods
Evaluation
Results
Results on RQ1
Results on RQ2
Discussions
Efficient LLM-based Method for Predicting Learning Performance
...and 7 more sections

Figures (2)

Figure 1: The interface of AutoTutor for the Center for the Study of Adult Literacy.
Figure 2: LLM-based prediction framework for learner learning performance.
