Towards Language Model Guided TLA+ Proof Automation
Yuhao Zhou, Stavros Tripakis
TL;DR
The paper tackles the bottleneck of proving theorems in the expressive $TLA^+$ proof system by introducing Language Model Guided Proof Automation ($LMGPA$), which uses LLMs to hierarchically decompose complex proof obligations and relies on symbolic backends ($TLAPS$) for rigorous verification. A recursive algorithm decomposes goals into sub-claims, with a normalization step that minimizes syntax errors and a feedback loop informed by backend prover results. Empirical evaluation on a 119-theorem benchmark shows that $LMGPA$ improves proof success rates over naive baselines and yields higher syntactic validity in LLM-generated decompositions. The work demonstrates a practical path to scalable formal verification for distributed systems, balancing reasoning power of LLMs with the rigor of symbolic provers and establishing a reusable benchmark for future research.
Abstract
Formal theorem proving with TLA+ provides rigorous guarantees for system specifications, but constructing proofs requires substantial expertise and effort. While large language models have shown promise in automating proofs for tactic-based theorem provers like Lean, applying these approaches directly to TLA+ faces significant challenges due to the unique hierarchical proof structure of the TLA+ proof system. We present a prompt-based approach that leverages LLMs to guide hierarchical decomposition of complex proof obligations into simpler sub-claims, while relying on symbolic provers for verification. Our key insight is to constrain LLMs to generate normalized claim decompositions rather than complete proofs, significantly reducing syntax errors. We also introduce a benchmark suite of 119 theorems adapted from (1) established mathematical collections and (2) inductive proofs of distributed protocols. Our approach consistently outperforms baseline methods across the benchmark suite.
