World Models for Math Story Problems

Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

TL;DR

This work introduces MathWorld, a graph-based semantic formalism for math story problems (MSPs) that captures dynamic world states with containers and relations. It provides an annotated corpus of 1,019 MSPs and 3,204 logical forms, and demonstrates three applications: interpretable problem solving via parsing and reasoning, probing LLMs with world-model prompts, and generating new MSPs conditioned on world models. The framework supports incremental sentence-by-sentence parsing, cross-sentence semantics, and conversion to first-order logic, and it offers equivalence and similarity metrics for evaluating world models. The paper also reports current limitations: limited parsing accuracy, reliance on a limited set of arithmetic types, and English-only data. It outlines directions for improving parsers, expanding the formalism, and using MSP generation for education and model analysis.

Abstract

Solving math story problems is a complex task for students and NLP models alike, requiring them to understand the world as described in the story and reason over it to compute an answer. Recent years have seen impressive performance on automatically solving these problems with large pre-trained language models and innovative techniques to prompt them. However, it remains unclear if these models possess accurate representations of mathematical concepts. This leads to a lack of interpretability and trustworthiness, which impedes their usefulness in various applications. In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific to the domain of math story problems. With MathWorld, we can assign world models to math story problems which represent the situations and actions introduced in the text and their mathematical relationships. We combine math story problems from several existing datasets and annotate a corpus of 1,019 problems and 3,204 logical forms with MathWorld. Using this data, we demonstrate the following use cases of MathWorld: (1) prompting language models with synthetically generated question-answer pairs to probe their reasoning and world modeling abilities, and (2) generating new problems by using the world models as a design space.
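
To make the idea of a world model concrete, the following is a minimal sketch of a graph in which containers are nodes and a transfer relation is an edge that updates their quantities. It is not the authors' implementation or annotation schema: the class names, fields, `apply_transfer` helper, and the example story are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """A node in the world model (hypothetical fields): who holds how many of what."""
    label: str       # possessor, e.g. "Alice"
    entity: str      # what is being counted, e.g. "apple"
    quantity: float


@dataclass(frozen=True)
class Transfer:
    """An edge (hypothetical): `quantity` units of `entity` move from sender to recipient."""
    sender: str
    recipient: str
    entity: str
    quantity: float


def apply_transfer(state, transfer):
    """Return the next world state after a transfer; the input state is not mutated."""
    next_state = dict(state)
    for label, sign in ((transfer.sender, -1), (transfer.recipient, +1)):
        key = (label, transfer.entity)
        old = next_state.get(key)
        if old is not None:  # only update containers that exist in the state
            next_state[key] = Container(
                label, transfer.entity, old.quantity + sign * transfer.quantity
            )
    return next_state


# Illustrative story: "Alice has 5 apples. Bob has 2 apples. Alice gives Bob 3 apples."
state = {
    ("Alice", "apple"): Container("Alice", "apple", 5),
    ("Bob", "apple"): Container("Bob", "apple", 2),
}
state = apply_transfer(state, Transfer("Alice", "Bob", "apple", 3))
print(state[("Bob", "apple")].quantity)  # answers "How many apples does Bob have?"
```

Keeping each state immutable and returning a new one per sentence mirrors the incremental, sentence-by-sentence view described in the TL;DR, though the paper's formalism covers more relation types than this sketch.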

Paper Structure

This paper contains 67 sections, 26 equations, 14 figures, 8 tables, and 1 algorithm.

Figures (14)

  • Figure 1: An example of a world model in MathWorld. MathWorld can be used to develop interpretable MSP solvers, to study the reasoning of LLMs and as a design space for generation of new MSPs.
  • Figure 2: Synthetically created question-answer pairs based on templates. Note that the quantity in the container or relation does not need to be expressed in text, but could be a variable. Such cases test the model’s ability to reason over intermediate quantities.
  • Figure 3: Example MSPs generated by GPT 3.5 Turbo.
  • Figure 4: Example of a world model using Transfer.
  • Figure 5: Example of a world model using Transfer.
  • ...and 9 more figures