Table of Contents
Fetching ...

MerLean: An Agentic Framework for Autoformalization in Quantum Computation

Yuanjie Ren, Jinzheng Li, Yidi Qi

TL;DR

MerLean addresses verification bottlenecks in frontier mathematical physics by automatically formalizing LaTeX statements into machine-checked Lean 4 code. It uses a bidirectional, agentic workflow where a frontier LLM translates LaTeX into Lean, then a second pass autoinformalizes the verified Lean back into LaTeX for semantic review. On three quantum computing papers, MerLean produced 114 statements and 2,050 Lean declarations, introducing necessary definitions and axioms when Mathlib lacked corresponding results, and completing end-to-end formalization within under 42 hours. The approach demonstrates practical utility for machine-verified peer review and data generation to train future reasoning models, with potential to generalize to other rigorous domains in mathematics and physics.

Abstract

We introduce MerLean, a fully automated agentic framework for autoformalization in quantum computation. MerLean extracts mathematical statements from \LaTeX{} source files, formalizes them into verified Lean~4 code built on Mathlib, and translates the result back into human-readable \LaTeX{} for semantic review. We evaluate MerLean on three theoretical quantum computing papers producing 2,050 Lean declarations from 114 statements in total. MerLean achieves end-to-end formalization on all three papers, reducing the verification burden to only the newly introduced definitions and axioms. Our results demonstrate that agentic autoformalization can scale to frontier research, offering both a practical tool for machine-verified peer review and a scalable engine for mining high-quality synthetic data to train future reasoning models. Our approach can also be generalized to any other rigorous research in mathematics and theoretical physics.

MerLean: An Agentic Framework for Autoformalization in Quantum Computation

TL;DR

MerLean addresses verification bottlenecks in frontier mathematical physics by automatically formalizing LaTeX statements into machine-checked Lean 4 code. It uses a bidirectional, agentic workflow where a frontier LLM translates LaTeX into Lean, then a second pass autoinformalizes the verified Lean back into LaTeX for semantic review. On three quantum computing papers, MerLean produced 114 statements and 2,050 Lean declarations, introducing necessary definitions and axioms when Mathlib lacked corresponding results, and completing end-to-end formalization within under 42 hours. The approach demonstrates practical utility for machine-verified peer review and data generation to train future reasoning models, with potential to generalize to other rigorous domains in mathematics and physics.

Abstract

We introduce MerLean, a fully automated agentic framework for autoformalization in quantum computation. MerLean extracts mathematical statements from \LaTeX{} source files, formalizes them into verified Lean~4 code built on Mathlib, and translates the result back into human-readable \LaTeX{} for semantic review. We evaluate MerLean on three theoretical quantum computing papers producing 2,050 Lean declarations from 114 statements in total. MerLean achieves end-to-end formalization on all three papers, reducing the verification burden to only the newly introduced definitions and axioms. Our results demonstrate that agentic autoformalization can scale to frontier research, offering both a practical tool for machine-verified peer review and a scalable engine for mining high-quality synthetic data to train future reasoning models. Our approach can also be generalized to any other rigorous research in mathematics and theoretical physics.
Paper Structure (31 sections, 2 equations, 2 figures, 2 tables)

This paper contains 31 sections, 2 equations, 2 figures, 2 tables.

Figures (2)

  • Figure 1: MerLean architecture. Autoformalization extracts statements from a LaTeX paper, formalizes them into Lean 4, and verifies faithfulness. Autoinformalization translates the verified code back into a human-readable LaTeX blueprint or file.
  • Figure 2: Distribution of compile attempts by statement type for each paper (maximum 30 attempts before axiom phase). Most statements resolve within 1--10 attempts; theorems and lemmas require significantly more iterations.