Table of Contents
Fetching ...

Proving Olympiad Algebraic Inequalities without Human Demonstrations

Chenrui Wei, Mengzhou Sun, Wei Wang

TL;DR

This work tackles the challenge of solving Olympiad-level algebraic inequalities with AI by introducing AIPS, a system that fuses a symbolic deductive engine with a neural value network to navigate vast inequality spaces. It generates a large synthetic corpus of high-quality theorems (about 191k) through forward reasoning and pattern matching, and trains the value network with curriculum learning to guide proof search. On a 20-problem IMO-level inequality benchmark (MO-INT-20), AIPS solves 10 problems, outperforming state-of-the-art baselines and several large language models, with some synthetic theorems reaching IMO-level difficulty. AIPS not only solves problems but also produces human-readable proofs and a set of non-trivial theorems, including one selected for a major city’s Mathematical Olympiad, underscoring its potential to augment mathematical discovery while highlighting current limitations in autonomous definition generation.

Abstract

Solving Olympiad-level mathematical problems represents a significant advancement in machine intelligence and automated reasoning. Current machine learning methods, however, struggle to solve Olympiad-level problems beyond Euclidean plane geometry due to a lack of large-scale, high-quality datasets. The challenge is even greater in algebraic systems, which involve infinite reasoning spaces within finite conditions. To address these issues, we propose AIPS, an Algebraic Inequality Proving System capable of autonomously generating complex inequality theorems and effectively solving Olympiad-level inequality problems without requiring human demonstrations. During proof search in a mixed reasoning manner, a value curriculum learning strategy on generated datasets is implemented to improve proving performance, demonstrating strong mathematical intuitions. On a test set of 20 International Mathematical Olympiad-level inequality problems, AIPS successfully solved 10, outperforming state-of-the-art methods. Furthermore, AIPS automatically generated a vast array of non-trivial theorems without human intervention, some of which have been evaluated by professional contestants and deemed to reach the level of the International Mathematical Olympiad. Notably, one theorem was selected as a competition problem in a major city 2024 Mathematical Olympiad.

Proving Olympiad Algebraic Inequalities without Human Demonstrations

TL;DR

This work tackles the challenge of solving Olympiad-level algebraic inequalities with AI by introducing AIPS, a system that fuses a symbolic deductive engine with a neural value network to navigate vast inequality spaces. It generates a large synthetic corpus of high-quality theorems (about 191k) through forward reasoning and pattern matching, and trains the value network with curriculum learning to guide proof search. On a 20-problem IMO-level inequality benchmark (MO-INT-20), AIPS solves 10 problems, outperforming state-of-the-art baselines and several large language models, with some synthetic theorems reaching IMO-level difficulty. AIPS not only solves problems but also produces human-readable proofs and a set of non-trivial theorems, including one selected for a major city’s Mathematical Olympiad, underscoring its potential to augment mathematical discovery while highlighting current limitations in autonomous definition generation.

Abstract

Solving Olympiad-level mathematical problems represents a significant advancement in machine intelligence and automated reasoning. Current machine learning methods, however, struggle to solve Olympiad-level problems beyond Euclidean plane geometry due to a lack of large-scale, high-quality datasets. The challenge is even greater in algebraic systems, which involve infinite reasoning spaces within finite conditions. To address these issues, we propose AIPS, an Algebraic Inequality Proving System capable of autonomously generating complex inequality theorems and effectively solving Olympiad-level inequality problems without requiring human demonstrations. During proof search in a mixed reasoning manner, a value curriculum learning strategy on generated datasets is implemented to improve proving performance, demonstrating strong mathematical intuitions. On a test set of 20 International Mathematical Olympiad-level inequality problems, AIPS successfully solved 10, outperforming state-of-the-art methods. Furthermore, AIPS automatically generated a vast array of non-trivial theorems without human intervention, some of which have been evaluated by professional contestants and deemed to reach the level of the International Mathematical Olympiad. Notably, one theorem was selected as a competition problem in a major city 2024 Mathematical Olympiad.
Paper Structure (44 sections, 1 theorem, 139 equations, 15 figures, 2 tables, 2 algorithms)

This paper contains 44 sections, 1 theorem, 139 equations, 15 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

( AM-GM) For non-negative real numbers $a_1, a_2, \ldots, a_n$, with equality if and only if $a_1 = a_2 = \cdots = a_n$.

Figures (15)

  • Figure 1: Examples of expression trees and pattern matching for the AM-GM inequality are illustrated. In (a), for $x, y, z \geq 0$, the value of $\frac{xyz}{x+y+z}$ decreases as $x+y+z$ increases, so the label of the node $x+y+z$ is $-1$. By applying the AM-GM inequality, we derive a series of upper bounds with respect to the root, e.g., $\frac{(xyz)^{2/3}}{3}$ and $\frac{xyz}{2\sqrt{x(y+z)}}$. In (b), when traversing the expression tree of $\frac{1}{a+b} + \frac{1}{b+c} + \frac{1}{c+a}$, pattern matching for the AM-GM inequality at various nodes yields different types of bounds, such as the upper bound $\frac{1}{2\sqrt{ab}} + \frac{1}{2\sqrt{bc}} + \frac{1}{2\sqrt{ca}}$ and the lower bound $\frac{3}{((a+b)(b+c)(c+a))^\frac{1}{3}}$.
  • Figure 2: An example of generating synthetic theorems in AIPS. When the initial premise $\sqrt{a^2+2bc}+\sqrt{b^2+2ca}+\sqrt{c^2+2ab}$ successfully matches with Jensen's inequality, a new inequality is generated. By subsequently applying transformation rules and matching other fundamental inequalities, such as the AM-GM inequality, the deductive engine incrementally generates new inequality theorems. When an inequality theorem is applied, the system verifies whether the equality condition holds, e.g., $a=b=c$.
  • Figure 3: (a) Distribution of inference depths in our dataset. (b) Self-evolving process of AIPS.
  • Figure 4: Overview of how AIPS proves a simple theorem. At each step, the deductive engine attempts to match inequality theorems with each side of the goal and applies all transformation rules to the expression, resulting in a list of new subgoals. The searched goal is placed into a closed list, ensuring that it will not be examined again. If one of the new subgoals is true, indicating that the inequality holds, then the theorem is proved. Otherwise, the new subgoals are added to the open list, along with other subgoals generated previously. A value network then evaluates all subgoals in the open list, and the top-value one is chosen for the next iteration of proof search.
  • Figure 5: Two examples of forward reasoning on the left and backward reasoning on the right.
  • ...and 10 more figures

Theorems & Definitions (1)

  • Theorem 1