Models Can and Should Embrace the Communicative Nature of Human-Generated Math

Sasha Boguraev; Ben Lipkin; Leonie Weissweiler; Kyle Mahowald

Abstract

Math is constructed by people for people: just as natural language corpora reflect not just propositions but the communicative goals of language users, the math data that models are trained on reflects not just idealized mathematical entities but rich communicative intentions. While there are important advantages to treating math in a purely symbolic manner, we here hypothesize that there are benefits to treating math as situated linguistic communication and that language models are well suited for this goal, in ways that are not fully appreciated. We illustrate these points with two case studies. First, we ran an experiment in which we found that language models interpret the equals sign in a humanlike way -- generating systematically different word problems for the same underlying equation arranged in different ways. Second, we found that language models prefer proofs to be ordered in naturalistic ways, even though other orders would be logically equivalent. We advocate for AI systems that learn from and represent the communicative intentions latent in human-generated math.

Paper Structure (62 sections, 4 theorems, 42 equations, 2 figures)

Introduction
Case Study One: Equations are Asymmetric
Methods
Results and Discussion
Case Study Two: Mathematical Rules and Proofs Have Orders
Methods
Equation Variants
Results and Discussion
Practical Applications
Math Education
Math Research
Conclusion
Equation Generation and Prompting in Case Study One
Equation Generation
Prompting Methods
...and 47 more sections

Key Result

Theorem 1

Suppose $\langle S,\star\rangle$ and $\langle S',\star'\rangle$ are binary algebraic structures, and $\phi$ is an isomorphism from $\langle S,\star\rangle$ onto $\langle S',\star'\rangle$. Further suppose that $e$ is a left identity element in $\langle S,\star\rangle$. Then $\phi(e)$ is a left identi

Figures (2)

Figure 1: For each pair of equations, we generate corresponding word problems and then try to recover the equations from those problems. The model often recovered the original ordering.
Figure 2: We compare average per-token surprisal for different, logically equivalent orderings of expressions in proofs from mirin2022mathematicians (first row) and corresponding variants (second through fourth rows). The original order has lower average per-token surprisal (that is, it is more probable) than the equivalent counterfactual orders.
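
As a rough illustration of the quantity compared in Figure 2, average per-token surprisal can be sketched as the mean of $-\log_2 p(\text{token} \mid \text{context})$ over a sequence. The function name `mean_surprisal` and the probability values below are hypothetical, chosen only for illustration; they are not taken from the paper, and in practice the probabilities would come from a language model's output distribution.

```python
import math


def mean_surprisal(token_probs):
    """Average per-token surprisal in bits: mean of -log2 p(token | context)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    return sum(-math.log2(p) for p in token_probs) / len(token_probs)


# Hypothetical per-token probabilities (made up for illustration,
# NOT results from the paper). Each list stands for one ordering of
# the same logically equivalent proof text.
original_order = [0.5, 0.25, 0.5]
variant_order = [0.25, 0.125, 0.25]

original_score = mean_surprisal(original_order)
variant_score = mean_surprisal(variant_order)

# A lower score means the model finds that ordering more probable.
print(f"original: {original_score:.3f} bits/token")
print(f"variant:  {variant_score:.3f} bits/token")
```

Under this sketch, an ordering the model prefers is simply the one with the lower mean surprisal. Averaging per token, rather than summing, keeps the comparison fair when reordered variants tokenize to different lengths.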

Theorems & Definitions (12)

Theorem 1
proof
proof
Theorem 2
proof
proof
Theorem 3
proof
proof
Theorem 4
...and 2 more
