Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation

Zhenwen Liang; Wenhao Yu; Tanmay Rajpurohit; Peter Clark; Xiangliang Zhang; Ashwin Kalyan

TL;DR

This work introduces CEMAL, a knowledge-tracing-driven framework that distills math word problem (MWP) solving from large language models (LLMs) into smaller, more efficient student solvers by generating customized exercises. It combines an iterative training workflow, an LLM-generated exercise book, and targeted data augmentation aimed at the student's weaknesses. The resulting solvers achieve state-of-the-art results on MAWPS and ASDiv-a with far fewer parameters, and perform robustly on SVAMP under both in-distribution (ID) and out-of-distribution (OOD) settings. Key findings include the superiority of targeted generation over random generation, the value of a sizable exercise book for robust validation, and the benefit of progressive augmentation over one-shot data expansion. The approach has practical implications for educational AI because it aligns distillation with knowledge tracing and personalized learning, though it requires careful prompt design and quality control of the generated content.

Abstract

In this paper, we present a novel approach for distilling math word problem solving capabilities from large language models (LLMs) into smaller, more efficient student models. Our approach is designed to consider the student model's weaknesses and foster a tailored learning experience by generating targeted exercises aligned with educational science principles, such as knowledge tracing and personalized learning. Concretely, we let GPT-3 be a math tutor and run two steps iteratively: 1) assessing the student model's current learning status on a GPT-generated exercise book, and 2) improving the student model by training it with tailored exercise samples generated by GPT-3. Experimental results reveal that our approach outperforms LLMs (e.g., GPT-3 and PaLM) in accuracy across three distinct benchmarks while employing significantly fewer parameters. Furthermore, we provide a comprehensive analysis of the various components within our methodology to substantiate their efficacy.
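The two-step loop described in the abstract (assess the student on an exercise book, then train it on exercises targeted at what it got wrong) can be illustrated with a minimal toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `Exercise`, `ToyStudent`, `toy_teacher`, and `tutor_loop` are hypothetical names, the "student" is a counter that masters a skill after seeing enough examples of it, and the "teacher" is a plain function standing in for an LLM call.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Exercise:
    text: str
    skill: str  # toy proxy for what the problem tests, e.g. an operator


@dataclass
class ToyStudent:
    """Hypothetical stand-in for the small solver.

    It 'solves' an exercise once it has trained on `needed` examples
    of the same skill. A real student would be a trained neural model.
    """
    needed: int = 3
    seen: Counter = field(default_factory=Counter)

    def train(self, exercises: List[Exercise]) -> None:
        for ex in exercises:
            self.seen[ex.skill] += 1

    def solves(self, ex: Exercise) -> bool:
        return self.seen[ex.skill] >= self.needed


def toy_teacher(failed: Exercise, n: int) -> List[Exercise]:
    """Hypothetical stand-in for the LLM tutor: n new exercises on the failed skill."""
    return [Exercise(f"{failed.text} (variant {i})", failed.skill) for i in range(n)]


def tutor_loop(
    student: ToyStudent,
    train_set: List[Exercise],
    exercise_book: List[Exercise],
    generate: Callable[[Exercise, int], List[Exercise]],
    rounds: int = 5,
    per_failure: int = 1,
) -> List[float]:
    """Alternate between assessing the student and training on targeted exercises."""
    batch = list(train_set)
    accuracies = []
    for _ in range(rounds):
        student.train(batch)
        # Step 1: assess the current learning status on the exercise book.
        failed = [ex for ex in exercise_book if not student.solves(ex)]
        accuracies.append(1 - len(failed) / len(exercise_book))
        if not failed:
            break
        # Step 2: request new exercises that target the failed ones.
        batch = [new for ex in failed for new in generate(ex, per_failure)]
    return accuracies


# Toy usage: the student starts out weak on "mul" and is tutored on it.
train_set = [Exercise("3 + 4", "add")] * 3 + [Exercise("3 * 4", "mul")]
book = [Exercise("5 + 6", "add"), Exercise("5 * 6", "mul")]
accuracies = tutor_loop(ToyStudent(), train_set, book, toy_teacher)
```

In this toy the extra exercises go only to the skill the student fails, which is the intuition behind targeted rather than random generation; the sketch says nothing about how the actual method selects or prompts for exercises.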

Paper Structure (22 sections, 3 figures, 4 tables, 1 algorithm)

Introduction
Related Work
Math Word Problem Solving
Large Language Models for Knowledge Distillation and Data Generation
Approach
Problem Definition
Training Workflow
Backbone of Student Solver
Exercise Generation
Targeted Generation vs. Random Generation
Experiments
Datasets
MAWPS
ASDiv-a
SVAMP
...and 7 more sections

Figures (3)

Figure 1: Accuracy vs. model size for representative baselines and our approach on the SVAMP dataset. Our method achieves performance competitive with LLMs while using significantly fewer parameters.
Figure 2: The overall iterative framework of CEMAL. After one round of training, the student, a small MWP solver, is evaluated on exercises provided by an LLM teacher. The LLM then generates customized exercises that target the student's knowledge state and weaknesses, thereby facilitating a customized improvement in its overall performance.
Figure 3: Performance comparison between one-time augmentation and progressive augmentation on SVAMP under the out-of-distribution setting.
