Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning
Zenan Li, Zhaoyu Li, Wen Tang, Xian Zhang, Yuan Yao, Xujie Si, Fan Yang, Kaiyu Yang, Xiaoxing Ma
TL;DR
This paper tackles the challenge of proving Olympiad-style inequalities by integrating neural and symbolic reasoning in a system called Lips. It partitions proof generation into two tactic families—scaling (symbolic) and rewriting (LLM-driven)—and uses a two-stage goal selection pipeline with symbolic filtering (homogeneity/decoupling) and neural ranking (chain-of-thought prompts) to efficiently search for subgoals. The approach yields human-readable, Lean-verifiable proofs and achieves state-of-the-art results on 161 inequalities, significantly outperforming both purely symbolic and purely neural baselines while avoiding additional training data. The work demonstrates the practicality and impact of neuro-symbolic integration for formal mathematical reasoning and points to pathways for broader domain coverage and automated tactic discovery.
Abstract
Large language models (LLMs) can prove mathematical theorems formally by generating proof steps (\textit{a.k.a.} tactics) within a proof system. However, the space of possible tactics is vast and complex, while the available training data for formal proofs is limited, posing a significant challenge to LLM-based tactic generation. To address this, we introduce a neuro-symbolic tactic generator that synergizes the mathematical intuition learned by LLMs with domain-specific insights encoded by symbolic methods. The key aspect of this integration is identifying which parts of mathematical reasoning are best suited to LLMs and which to symbolic methods. While the high-level idea of neuro-symbolic integration is broadly applicable to various mathematical problems, in this paper, we focus specifically on Olympiad inequalities (Figure~1). We analyze how humans solve these problems and distill the techniques into two types of tactics: (1) scaling, handled by symbolic methods, and (2) rewriting, handled by LLMs. In addition, we combine symbolic tools with LLMs to prune and rank the proof goals for efficient proof search. We evaluate our framework on 161 challenging inequalities from multiple mathematics competitions, achieving state-of-the-art performance and significantly outperforming existing LLM and symbolic approaches without requiring additional training data.
