MathBERT: A Pre-trained Language Model for General NLP Tasks in Mathematics Education
Jia Tracy Shen, Michiharu Yamashita, Ethan Prihar, Neil Heffernan, Xintao Wu, Ben Graff, Dongwon Lee
TL;DR
MathBERT is a domain-adapted BERT model pre-trained on a large, diverse mathematics corpus (pre-k to graduate-level content) to improve NLP tasks in mathematics education. The authors construct a math-focused vocabulary (mathVocab) and evaluate on three core tasks: knowledge component prediction, auto-grading of open responses, and knowledge tracing. On these tasks the model shows consistent gains over BASE BERT and over prior state-of-the-art methods. The approach includes both task-adaptive pre-training (TAPT) and domain-adaptive pre-training (DAPT) strategies, with extensive experiments on ASSISTments data and real-world deployment in ASSISTments and K12.com. Key findings show MathBERT achieving 1.98%–8.28% improvements over BASE BERT and up to 22.01% over prior best methods, with mathVocab offering additional benefits in many settings. The model is publicly released, and the work points to uses in automatic scoring, feedback, and skill-guided content recommendations in math education.
Abstract
Since the introduction of the original BERT (i.e., BASE BERT), researchers have developed various customized BERT models with improved performance for specific domains and tasks by exploiting the benefits of transfer learning. Because mathematical texts often use domain-specific vocabulary along with equations and math symbols, we posit that a new BERT model for mathematics would be useful for many mathematical downstream tasks. In this resource paper, we introduce our multi-institutional effort (i.e., two learning platforms and three academic institutions in the US) toward this need: MathBERT, a model created by pre-training the BASE BERT model on a large mathematical corpus ranging from pre-kindergarten (pre-k), to high-school, to college graduate-level mathematical content. In addition, we select three general NLP tasks that are often used in mathematics education (prediction of knowledge component, auto-grading of open-ended Q&A, and knowledge tracing) to demonstrate the superiority of MathBERT over BASE BERT. Our experiments show that MathBERT outperforms prior best methods by 1.2-22% and BASE BERT by 2-8% on these tasks. In addition, we build a mathematics-specific vocabulary, 'mathVocab', to train with MathBERT. We find that MathBERT pre-trained with 'mathVocab' outperforms MathBERT trained with the BASE BERT vocabulary (i.e., 'origVocab'). MathBERT is currently being adopted at the participating learning platforms: Stride, Inc., a commercial educational resource provider, and ASSISTments.org, a free online educational platform. We release MathBERT for public use at: https://github.com/tbs17/MathBERT.
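
To give some intuition for why a mathematics-specific vocabulary might matter, the following minimal sketch applies greedy longest-match-first WordPiece splitting (the scheme used by BERT tokenizers) to one math term under two toy vocabularies. The vocabulary entries and the names `general_vocab` and `math_vocab` are hypothetical and chosen only for illustration; they are not the actual contents of 'origVocab' or 'mathVocab'.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into WordPiece tokens by greedy longest-match-first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matched: the whole word maps to the unknown token.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabularies (NOT the real origVocab / mathVocab).
general_vocab = {"hyp", "##ot", "##en", "##use", "triangle"}
math_vocab = general_vocab | {"hypotenuse"}

general_pieces = wordpiece_tokenize("hypotenuse", general_vocab)
math_pieces = wordpiece_tokenize("hypotenuse", math_vocab)

print(general_pieces)  # ['hyp', '##ot', '##en', '##use']
print(math_pieces)     # ['hypotenuse']
```

Under these assumed vocabularies, the general vocabulary fragments the term into four sub-word pieces, while the math-aware vocabulary keeps it as a single token. Whether this kind of difference accounts for the reported gains is not something the sketch can show.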
