Table of Contents
Fetching ...

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

Boyan Duan, Xiao Liang, Shuai Lu, Yaoxiang Wang, Yelong Shen, Kai-Wei Chang, Ying Nian Wu, Mao Yang, Weizhu Chen, Yeyun Gong

TL;DR

The paper addresses the challenge of IMO-level automated geometry theorem proving by presenting HAGeo, a CPU-only, neural-free framework that uses efficient deduction (DD) and algebraic reasoning (AR) augmented by heuristic auxiliary constructions. It demonstrates gold-medal performance on the IMO-30 benchmark and achieves substantial speedups over neural approaches, while introducing HAGeo-409, a larger, more challenging benchmark with human-difficulty annotations. Key contributions include an optimized geometry-specific language, a faster DDAR engine, and a principled auxiliary-point heuristic that yields robust performance across problem difficulty. These advances offer a scalable, GPU-free path for geometry theorem proving and provide a rigorous benchmark to drive future improvements in the field.

Abstract

Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.

Gold-Medal-Level Olympiad Geometry Solving with Efficient Heuristic Auxiliary Constructions

TL;DR

The paper addresses the challenge of IMO-level automated geometry theorem proving by presenting HAGeo, a CPU-only, neural-free framework that uses efficient deduction (DD) and algebraic reasoning (AR) augmented by heuristic auxiliary constructions. It demonstrates gold-medal performance on the IMO-30 benchmark and achieves substantial speedups over neural approaches, while introducing HAGeo-409, a larger, more challenging benchmark with human-difficulty annotations. Key contributions include an optimized geometry-specific language, a faster DDAR engine, and a principled auxiliary-point heuristic that yields robust performance across problem difficulty. These advances offer a scalable, GPU-free path for geometry theorem proving and provide a rigorous benchmark to drive future improvements in the field.

Abstract

Automated theorem proving in Euclidean geometry, particularly for International Mathematical Olympiad (IMO) level problems, remains a major challenge and an important research focus in Artificial Intelligence. In this paper, we present a highly efficient method for geometry theorem proving that runs entirely on CPUs without relying on neural network-based inference. Our initial study shows that a simple random strategy for adding auxiliary points can achieve silver-medal level human performance on IMO. Building on this, we propose HAGeo, a Heuristic-based method for adding Auxiliary constructions in Geometric deduction that solves 28 of 30 problems on the IMO-30 benchmark, achieving gold-medal level performance and surpassing AlphaGeometry, a competitive neural network-based approach, by a notable margin. To evaluate our method and existing approaches more comprehensively, we further construct HAGeo-409, a benchmark consisting of 409 geometry problems with human-assessed difficulty levels. Compared with the widely used IMO-30, our benchmark poses greater challenges and provides a more precise evaluation, setting a higher bar for geometry theorem proving.

Paper Structure

This paper contains 27 sections, 7 figures, 2 tables.

Figures (7)

  • Figure 1: Problem difficulity distribution of the IMO-30 benchmark and our new HAGeo-409 benchmark.
  • Figure 2: Overview of the HAGeo method. First, the DDAR engine deduces new statements in the problem. If the DDAR does not solve the problem, our heuristic-based strategy gives additional attempts for adds auxiliary constructions to help solve the problem and re-runs the DDAR.
  • Figure 3: Pipeline for adding heuristic auxiliary points. Each complete auxiliary construction consists of up to $N$ rounds of auxiliary point generation. In each round, the system selects one valid auxiliary point from all possible candidates determined through algebraical computation.
  • Figure 4: The heuristics proposed in HAGeo. The paired visualization is present in Figure \ref{['fig:aux_example']}. Each heuristic is exemplified using a distinct color, which are aligned with the lines and points shown in Figure \ref{['fig:aux_example']}.
  • Figure 5: Illustration of heuristic auxiliary points. Geometry configuration: $I,G,O,H$ are the incenter, centroid, circumcenter, and orthocenter of $\triangle ABC$; $L,M,N$ are the midpoints of $BC,CA,AB$; and $D,E,F$ are the tangent points of the incircle $(I)$ with sides $BC,CA,AB$.
  • ...and 2 more figures