Table of Contents
Fetching ...

Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning

Zenan Li, Zhaoyu Li, Wen Tang, Xian Zhang, Yuan Yao, Xujie Si, Fan Yang, Kaiyu Yang, Xiaoxing Ma

TL;DR

This paper tackles the challenge of proving Olympiad-style inequalities by integrating neural and symbolic reasoning in a system called Lips. It partitions proof generation into two tactic families—scaling (symbolic) and rewriting (LLM-driven)—and uses a two-stage goal selection pipeline with symbolic filtering (homogeneity/decoupling) and neural ranking (chain-of-thought prompts) to efficiently search for subgoals. The approach yields human-readable, Lean-verifiable proofs and achieves state-of-the-art results on 161 inequalities, significantly outperforming both purely symbolic and purely neural baselines while avoiding additional training data. The work demonstrates the practicality and impact of neuro-symbolic integration for formal mathematical reasoning and points to pathways for broader domain coverage and automated tactic discovery.

Abstract

Large language models (LLMs) can prove mathematical theorems formally by generating proof steps (\textit{a.k.a.} tactics) within a proof system. However, the space of possible tactics is vast and complex, while the available training data for formal proofs is limited, posing a significant challenge to LLM-based tactic generation. To address this, we introduce a neuro-symbolic tactic generator that synergizes the mathematical intuition learned by LLMs with domain-specific insights encoded by symbolic methods. The key aspect of this integration is identifying which parts of mathematical reasoning are best suited to LLMs and which to symbolic methods. While the high-level idea of neuro-symbolic integration is broadly applicable to various mathematical problems, in this paper, we focus specifically on Olympiad inequalities (Figure~1). We analyze how humans solve these problems and distill the techniques into two types of tactics: (1) scaling, handled by symbolic methods, and (2) rewriting, handled by LLMs. In addition, we combine symbolic tools with LLMs to prune and rank the proof goals for efficient proof search. We evaluate our framework on 161 challenging inequalities from multiple mathematics competitions, achieving state-of-the-art performance and significantly outperforming existing LLM and symbolic approaches without requiring additional training data.

Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning

TL;DR

This paper tackles the challenge of proving Olympiad-style inequalities by integrating neural and symbolic reasoning in a system called Lips. It partitions proof generation into two tactic families—scaling (symbolic) and rewriting (LLM-driven)—and uses a two-stage goal selection pipeline with symbolic filtering (homogeneity/decoupling) and neural ranking (chain-of-thought prompts) to efficiently search for subgoals. The approach yields human-readable, Lean-verifiable proofs and achieves state-of-the-art results on 161 inequalities, significantly outperforming both purely symbolic and purely neural baselines while avoiding additional training data. The work demonstrates the practicality and impact of neuro-symbolic integration for formal mathematical reasoning and points to pathways for broader domain coverage and automated tactic discovery.

Abstract

Large language models (LLMs) can prove mathematical theorems formally by generating proof steps (\textit{a.k.a.} tactics) within a proof system. However, the space of possible tactics is vast and complex, while the available training data for formal proofs is limited, posing a significant challenge to LLM-based tactic generation. To address this, we introduce a neuro-symbolic tactic generator that synergizes the mathematical intuition learned by LLMs with domain-specific insights encoded by symbolic methods. The key aspect of this integration is identifying which parts of mathematical reasoning are best suited to LLMs and which to symbolic methods. While the high-level idea of neuro-symbolic integration is broadly applicable to various mathematical problems, in this paper, we focus specifically on Olympiad inequalities (Figure~1). We analyze how humans solve these problems and distill the techniques into two types of tactics: (1) scaling, handled by symbolic methods, and (2) rewriting, handled by LLMs. In addition, we combine symbolic tools with LLMs to prune and rank the proof goals for efficient proof search. We evaluate our framework on 161 challenging inequalities from multiple mathematics competitions, achieving state-of-the-art performance and significantly outperforming existing LLM and symbolic approaches without requiring additional training data.

Paper Structure

This paper contains 22 sections, 34 equations, 7 figures, 4 tables, 3 algorithms.

Figures (7)

  • Figure 1: We prove inequality problems in math Olympiads that involve a finite number of real variables, hypotheses, and one conclusion. Both the hypotheses and the conclusion consist of constants, variables, algebraic operations (e.g., addition, multiplication), and transcendental functions like exp.
  • Figure 2: An overview of our neuro-symbolic inequality prover Lips. By integrating both LLMs and symbolic methods in an iterative process of tactic generation and goal selection, it can generate human-readable and formally verifiable proofs in Lean for Olympiad-level inequality problems.
  • Figure 3: Tactic pruning ratio ($\uparrow$, higher is better) and number of iterations ($\downarrow$). The results illustrate that the tactic pruning method of Lips is very stable, and outperforms the existing method by 7.92% on average. Furthermore, the high efficiency of Lips is derived from accurate goal selection, allowing a proof to be successfully constructed with a small number of goal selection iterations.
  • Figure 4: Figure (a) demonstrates how CAD is performed on two intersecting unit circles, deriving multiple sign-invariant cells (i.e., colored region). Figure (b) illustrates the process of inequality proving, which constructs a chain of proof goals by iteratively applying tactics; Figure (c) provides a corresponding instantiation of proving $ab+bc+ca \leq a ^ 2 + b ^ 2 + c ^ 2$.
  • Figure 5: Two examples of AM-GM inequality encoded as scaling tactics
  • ...and 2 more figures