Retrieval-Augmented TLAPS Proof Generation with Large Language Models

Yuhao Zhou

TL;DR

Writing TLAPS proofs remains challenging because they require hierarchical, multi-step reasoning. The paper introduces a two-phase approach that combines sub-obligation decomposition with retrieval-augmented generation to produce TLAPS-verifiable proofs, illustrated with the simple theorem $\forall x \in Nat : Even(x+x)$. An implementation of the approach is evaluated on the Boyer-Moore Majority Vote algorithm: it succeeds on intermediate-complexity proof obligations but shows limitations on more complex theorems. The results indicate that LLM-assisted proof generation can meaningfully assist formal verification workflows, especially when every generated proof is checked by TLAPS.
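
For readers unfamiliar with the algorithm used in the evaluation, the following is a minimal Python sketch of Boyer-Moore majority voting. It is illustrative only: the paper works with a TLA+ specification, not Python, and the function names `majority_candidate` and `majority_element` are hypothetical names chosen for this sketch.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def majority_candidate(items: Sequence[T]) -> Optional[T]:
    """Single pass: return the only value that could be a strict majority."""
    candidate: Optional[T] = None
    count = 0
    for item in items:
        if count == 0:
            # No surviving candidate, so adopt the current item.
            candidate = item
            count = 1
        elif item == candidate:
            count += 1
        else:
            # A differing item cancels one vote for the candidate.
            count -= 1
    return candidate


def majority_element(items: Sequence[T]) -> Optional[T]:
    """Return the strict-majority value, or None if there is none."""
    candidate = majority_candidate(items)
    # The first pass only nominates; a second pass confirms the majority.
    occurrences = sum(1 for x in items if x == candidate)
    if items and 2 * occurrences > len(items):
        return candidate
    return None
```

The property that makes the algorithm a natural verification target is that, whenever some value occupies more than half of the positions, the first pass is guaranteed to return it. When no such value exists, the candidate is arbitrary, which is why the sketch adds a confirming second pass.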

Abstract

We present a novel approach to automated proof generation for the TLA+ Proof System (TLAPS) using Large Language Models (LLMs). Our method combines two key components: a sub-proof obligation generation phase that breaks down complex proof obligations into simpler sub-obligations, and a proof generation phase that leverages Retrieval-Augmented Generation with verified proof examples. We evaluate our approach on proof obligations of varying complexity, ranging from fundamental arithmetic properties to properties of algorithms. Our experiments demonstrate that while the method successfully generates valid proofs for intermediate-complexity obligations, it faces limitations with more complex theorems. These results indicate that our approach can effectively assist in proof development for certain classes of properties, contributing to the broader goal of integrating LLMs into formal verification workflows.
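
To make the retrieval step concrete, here is a small Python sketch of how verified examples might be selected for an obligation and placed into a prompt. Everything in it is an assumption made for illustration: the abstract does not say how examples are retrieved or how prompts are worded, so the token-overlap (Jaccard) ranking, the `EXAMPLE_CORPUS` entries, the placeholder proof strings, and the functions `retrieve_examples` and `build_prompt` are all hypothetical.

```python
import re
from typing import Dict, List, Set

# Hypothetical corpus; the proof fields are placeholders, not real TLAPS proofs.
EXAMPLE_CORPUS: List[Dict[str, str]] = [
    {"name": "even_double",
     "obligation": r"\A n \in Nat : Even(n+n)",
     "proof": "<verified proof text>"},
    {"name": "len_nonneg",
     "obligation": r"\A s \in Seq(Int) : Len(s) >= 0",
     "proof": "<verified proof text>"},
]


def tokenize(text: str) -> Set[str]:
    """Split into identifiers and backslash operators such as \\A or \\in."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_]*|\\[A-Za-z]+", text))


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap in [0, 1]; an assumed stand-in for a real retriever."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_examples(obligation: str,
                      corpus: List[Dict[str, str]],
                      k: int = 1) -> List[Dict[str, str]]:
    """Return the k corpus entries whose obligations best match the query."""
    query = tokenize(obligation)
    ranked = sorted(
        corpus,
        key=lambda ex: jaccard(query, tokenize(ex["obligation"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(obligation: str, examples: List[Dict[str, str]]) -> str:
    """Assemble a prompt: retrieved examples first, then the target obligation."""
    parts = []
    for ex in examples:
        parts.append("Obligation: " + ex["obligation"])
        parts.append("Proof: " + ex["proof"])
    parts.append("Obligation: " + obligation)
    parts.append("Proof:")
    return "\n".join(parts)
```

In a pipeline of the kind the abstract describes, the resulting prompt would be sent to the LLM and the returned proof checked by TLAPS. Both of those steps are outside this sketch.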

Paper Structure (34 sections, 1 equation, 8 figures)

Figures (8)

  • Figure 1: Example Input to the System
  • Figure 2: Expected Output from the System for the Input of Figure 1
  • Figure 3: System Overview
  • Figure 4: Template of Sub-proof Obligation Decomposition Query
  • Figure 5: Template of Iterative Refinement Query
  • ...and 3 more figures