Proving Olympiad Algebraic Inequalities without Human Demonstrations

Chenrui Wei; Mengzhou Sun; Wei Wang

Proving Olympiad Algebraic Inequalities without Human Demonstrations

Chenrui Wei, Mengzhou Sun, Wei Wang

TL;DR

This work tackles the challenge of solving Olympiad-level algebraic inequalities with AI by introducing AIPS, a system that fuses a symbolic deductive engine with a neural value network to navigate vast inequality spaces. It generates a large synthetic corpus of high-quality theorems (about 191k) through forward reasoning and pattern matching, and trains the value network with curriculum learning to guide proof search. On a 20-problem IMO-level inequality benchmark (MO-INT-20), AIPS solves 10 problems, outperforming state-of-the-art baselines and several large language models, with some synthetic theorems reaching IMO-level difficulty. AIPS not only solves problems but also produces human-readable proofs and a set of non-trivial theorems, including one selected for a major city’s Mathematical Olympiad, underscoring its potential to augment mathematical discovery while highlighting current limitations in autonomous definition generation.

Abstract

Solving Olympiad-level mathematical problems represents a significant advancement in machine intelligence and automated reasoning. Current machine learning methods, however, struggle to solve Olympiad-level problems beyond Euclidean plane geometry due to a lack of large-scale, high-quality datasets. The challenge is even greater in algebraic systems, which involve infinite reasoning spaces within finite conditions. To address these issues, we propose AIPS, an Algebraic Inequality Proving System capable of autonomously generating complex inequality theorems and effectively solving Olympiad-level inequality problems without requiring human demonstrations. During proof search in a mixed reasoning manner, a value curriculum learning strategy on generated datasets is implemented to improve proving performance, demonstrating strong mathematical intuitions. On a test set of 20 International Mathematical Olympiad-level inequality problems, AIPS successfully solved 10, outperforming state-of-the-art methods. Furthermore, AIPS automatically generated a vast array of non-trivial theorems without human intervention, some of which have been evaluated by professional contestants and deemed to reach the level of the International Mathematical Olympiad. Notably, one theorem was selected as a competition problem in a major city 2024 Mathematical Olympiad.

Proving Olympiad Algebraic Inequalities without Human Demonstrations

TL;DR

Abstract

Paper Structure (44 sections, 1 theorem, 139 equations, 15 figures, 2 tables, 2 algorithms)

This paper contains 44 sections, 1 theorem, 139 equations, 15 figures, 2 tables, 2 algorithms.

Introduction
Related Work
Algebraic Inequality Proving System
Symbolic Deductive Engine for Algebra
Representation for Algebraic Expressions and Theorems
Pattern Matching for Inequality Theorems
Forward Reasoning
Olympiad-Level Inequality Proof Set
Synthetic Theorem Generation
Synthetic Theorem Evaluation
Neural Algebraic Inequality Prover
Searching Proofs by Combining Value Network with Symbolic Prover
Pre-training Value Network Using a Heuristic Function
Fine-tuning Value Network on Filtered Synthetic Data
Experiments
...and 29 more sections

Key Result

Theorem 1

( AM-GM) For non-negative real numbers $a_1, a_2, \ldots, a_n$, with equality if and only if $a_1 = a_2 = \cdots = a_n$.

Figures (15)

Figure 1: Examples of expression trees and pattern matching for the AM-GM inequality are illustrated. In (a), for $x, y, z \geq 0$, the value of $\frac{xyz}{x+y+z}$ decreases as $x+y+z$ increases, so the label of the node $x+y+z$ is $-1$. By applying the AM-GM inequality, we derive a series of upper bounds with respect to the root, e.g., $\frac{(xyz)^{2/3}}{3}$ and $\frac{xyz}{2\sqrt{x(y+z)}}$. In (b), when traversing the expression tree of $\frac{1}{a+b} + \frac{1}{b+c} + \frac{1}{c+a}$, pattern matching for the AM-GM inequality at various nodes yields different types of bounds, such as the upper bound $\frac{1}{2\sqrt{ab}} + \frac{1}{2\sqrt{bc}} + \frac{1}{2\sqrt{ca}}$ and the lower bound $\frac{3}{((a+b)(b+c)(c+a))^\frac{1}{3}}$.
Figure 2: An example of generating synthetic theorems in AIPS. When the initial premise $\sqrt{a^2+2bc}+\sqrt{b^2+2ca}+\sqrt{c^2+2ab}$ successfully matches with Jensen's inequality, a new inequality is generated. By subsequently applying transformation rules and matching other fundamental inequalities, such as the AM-GM inequality, the deductive engine incrementally generates new inequality theorems. When an inequality theorem is applied, the system verifies whether the equality condition holds, e.g., $a=b=c$.
Figure 3: (a) Distribution of inference depths in our dataset. (b) Self-evolving process of AIPS.
Figure 4: Overview of how AIPS proves a simple theorem. At each step, the deductive engine attempts to match inequality theorems with each side of the goal and applies all transformation rules to the expression, resulting in a list of new subgoals. The searched goal is placed into a closed list, ensuring that it will not be examined again. If one of the new subgoals is true, indicating that the inequality holds, then the theorem is proved. Otherwise, the new subgoals are added to the open list, along with other subgoals generated previously. A value network then evaluates all subgoals in the open list, and the top-value one is chosen for the next iteration of proof search.
Figure 5: Two examples of forward reasoning on the left and backward reasoning on the right.
...and 10 more figures

Theorems & Definitions (1)

Theorem 1

Proving Olympiad Algebraic Inequalities without Human Demonstrations

TL;DR

Abstract

Proving Olympiad Algebraic Inequalities without Human Demonstrations

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (15)

Theorems & Definitions (1)