Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan

TL;DR

This paper tackles generating Socratic subquestions to teach math word problems by training a transformer-based question generator augmented with a content planner and reinforcement learning. Subquestions are guided by focused content (operators/equations) and goal-driven rewards (fluency, granularity, answerability), trained and evaluated on GSM8K. Empirical results show improved Math QA solver performance and higher-quality questions from both automatic and human evaluations, though educational impact varies with problem difficulty and learner familiarity. The study highlights the potential and challenges of deploying automated Socratic questioning to support both AI problem solving and human learning, outlining directions for more robust future work.

Abstract

Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generation of didactically sound questions is challenging, requiring understanding of the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance, but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning. On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education.

Paper Structure (36 sections, 4 equations, 6 figures, 9 tables)

This paper contains 36 sections, 4 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: Math word problems can be procedurally solved in multiple reasoning steps. One operationalization of Socratic questioning is to map each step in the procedure to a question. Asking (machines/humans) the right set of questions in a certain sequence (shown in green) can be an effective way to do so. In order to be effective, the Socratic questioning should be focused and goal-driven.
  • Figure 2: Our overall methodology: Two Socratic properties of focused (red dotted box) and goal-driven (green dotted box) question generation are added to the question generation model with a combination of content planning and reward based finetuning. Here, $\oplus$ represents the concatenation operation.
  • Figure 3: Comparison of baseline versus our model generated sub-questions on several metrics from our human evaluations (showing mean and standard deviation).
  • Figure 4: Second submission success rate for problems with at least 10% occurrence for each group (excluding the two simplest problems 1 and 6). Difficulty level is annotated blind to the correct solution.
  • Figure 5: Interface for our user study (cf. Section 6). For each problem, the first screen contains the MWP text, a calculator, and an input box to submit the answer.
  • ...and 1 more figures