Table of Contents
Fetching ...

GFLean: An Autoformalisation Framework for Lean via GF

Shashank Pathak

TL;DR

GFLean tackles the challenge of autoformalising mathematical text into Lean input by combining Grammatical Framework with a Haskell-based processing pipeline to translate Simplified ForTheL statements into Lean expressions. The four-stage pipeline—parsing, AST simplification, AST transformation, and Lean linearization—provides a modular, grammar-driven path from informal mathematical statements to formal Lean code. Empirical results show that 42 of 62 statements from a standard textbook can be formalised with minor rephrasing, while remaining gaps highlight lexicon limitations and linguistic coverage. The work discusses limitations and future directions, including expanding the grammar and integrating with Lean to enable dynamic interaction, and argues for a hybrid approach that combines rule-based and neural translation for robustness.

Abstract

We present an autoformalisation framework for the Lean theorem prover, called GFLean. GFLean uses a high-level grammar writing tool called Grammatical Framework (GF) for parsing and linearisation. GFLean is implemented in Haskell. We explain the functionalities of GFLean, its inner working and discuss its limitations. We also discuss how we can use neural network based translation programs and rule based translation programs together complimenting each other to build robust autoformalisation frameworks.

GFLean: An Autoformalisation Framework for Lean via GF

TL;DR

GFLean tackles the challenge of autoformalising mathematical text into Lean input by combining Grammatical Framework with a Haskell-based processing pipeline to translate Simplified ForTheL statements into Lean expressions. The four-stage pipeline—parsing, AST simplification, AST transformation, and Lean linearization—provides a modular, grammar-driven path from informal mathematical statements to formal Lean code. Empirical results show that 42 of 62 statements from a standard textbook can be formalised with minor rephrasing, while remaining gaps highlight lexicon limitations and linguistic coverage. The work discusses limitations and future directions, including expanding the grammar and integrating with Lean to enable dynamic interaction, and argues for a hybrid approach that combines rule-based and neural translation for robustness.

Abstract

We present an autoformalisation framework for the Lean theorem prover, called GFLean. GFLean uses a high-level grammar writing tool called Grammatical Framework (GF) for parsing and linearisation. GFLean is implemented in Haskell. We explain the functionalities of GFLean, its inner working and discuss its limitations. We also discuss how we can use neural network based translation programs and rule based translation programs together complimenting each other to build robust autoformalisation frameworks.
Paper Structure (26 sections, 42 theorems, 7 equations, 3 figures)

This paper contains 26 sections, 42 theorems, 7 equations, 3 figures.

Key Result

Theorem 1

Let $x \in \textbf{R}$. If $x < 0$, then $x^2 + 1 > 0$.

Figures (3)

  • Figure 1: The AST produced by GF for a sentence from DemoEng.
  • Figure 2: The GFLean processing pipeline
  • Figure 3: The AST produced by GF after parsing Ex. 4 is not less than 3.

Theorems & Definitions (42)

  • Theorem 1: Result 3.1
  • Theorem 2: Result 3.2
  • Theorem 3: Exercise 3.1
  • Theorem 4: Exercise 3.3
  • Theorem 5: Exercise 3.4
  • Theorem 6: Exercise 3.6
  • Theorem 7: Exercise 3.7
  • Theorem 8: Result 3.4
  • Theorem 9: Result 3.5
  • Theorem 10: Result 3.6
  • ...and 32 more